ABSTRACT

This  thesis  presents  Cost  Estimating  Relationships  (CERs)  for  fighter  aircraft. 
Since  the  fighter  aircraft  is  one  of  the  most  important  tactical  weapon  systems,  it  is 
very  useful  to  establish  CERs  solely  for  fighter  aircraft.  Using  the  public  data  on  U.S. 
fighter  aircraft,  Ordinary  Least  Squares  (OLS)  is  used  as  the  primary  statistical  method 
of  establishing  CERs.  The  data  collection  techniques  and  adjustments  used  are 
discussed,  and  simple  and  multiple  linear  regressions  are  performed  on  various 
combinations  of  the  explanatory  variables.  This  thesis  then  shows  that  CERs  based  on 
new  fighter  aircraft  data  are  more  reliable  than  those  based  on  new  and  old  fighter 
aircraft  data. 

I. INTRODUCTION

A  parametric  cost  estimate  has  been  defined  as  an  estimate  which  predicts  cost 
by  means  of  explanatory  variables  such  as  performance  characteristics,  physical 
characteristics,  and  characteristics  relevant  to  the  development  process,  as  derived  from 
experience  on  logically  related  systems  [Ref.  1:  p. 72].  It  is  based  on  the  assumption 
that  the  past  is  somehow  a  reliable  guide  to  the  future,  which  means  the  estimation 
captures  the  relationship  between  past  experience  and  future  application. 

The  cost  estimation  of  military  hardware  uses  experience  on  existing  equipment 
to  predict  the  cost  of  next-generation  weapons.  Traditionally,  acquisition  of  next- 
generation  weapons  requires  substantial  costs.  In  the  past,  however,  cost  was  not 
always a major consideration in choosing the weapons. To save money in the long run
and  operate  within  a  tighter  budget,  costs  must  be  reliably  estimated  during 
requirements  formulation  in  determining  which  weapon  provides  the  best  value  in 
fulfilling  mission  needs. 

Cost Estimating Relationships (CERs) are mathematical equations that express
system costs as a function of various explanatory variables. They are most generally
derived through statistical regression analysis of historical cost data. The construction
and use of CERs form the foundation for making independent parametric cost
estimates [Ref. 2: p. 2].

A.       THESIS  OBJECTIVE 

Developing new CERs for fighter aircraft is the major objective of this thesis. In
fact, there are several cost estimating methods and CERs for aircraft. This thesis will
discuss the statistical approaches and the CERs for fighter aircraft only, using
explanatory variables such as thrust and weight.

This  thesis  also  has  objectives  related  to  the  goal  of  developing  new  CERs.  They 
are: 

1) To research currently developed CERs based on historical data. Many CERs
were developed in previous periods. They may be used by an experienced
analyst, and studying them will help in developing new CERs.


2)  To  present  data  collection  and  adjustment  approaches.  Collecting  the  right 
data  and  adjusting  the  collected  data  are  required  in  order  to  develop  CERs. 
Data  imperfections  are  frequently  encountered  difficulties  in  weapon  system 
cost  estimation. 

3) To apply alternative statistical methods. CERs that use explanatory variables
are relied upon to predict the cost at a high level of aggregation. The
statistical techniques can be used in a variety of situations, but not in all
situations. They will vary according to the purpose of the study and the
information available.

4) To apply CERs. By using newly developed CERs, it may be possible to
predict the costs of fighter aircraft. It may also be possible to estimate the
costs of international fighter aircraft from these CERs.

B.       WHY DO THIS?

Korea  (South)  knows  the  misery  of  war  as  a  result  of  the  Korean  War 
(1950-1953)  and  wishes  to  live  in  peace  forever.  However,  North  Korea  is  a  belligerent 
communist  country.  Therefore,  as  a  deterrent  to  an  all-out  war,  Korea  has  to  have 
high  defense  capabilities.  Maintenance  of  a  strong  defense  force  is  one  of  the  most 
reliable  ways  to  keep  the  peace. 

Ownership  of  superior  weapon  systems  is  one  of  the  best  methods  of  maintaining 
strong  defenses.  Fighter  aircraft  are  one  of  the  most  powerful  weapon  systems 
developed  for  modern  warfare.  However,  fighter  aircraft  acquisition  is  extremely 
expensive.  Since  excessive  spending  for  defense  will  check  national  development,  the 
choice  between  systems  must  be  seriously  considered. 

Korea  is  still  a  developing  country  and  is  currently  one  of  the  major  weapon 
importing  countries.  Nevertheless,  the  economic  growth  of  Korea  is  worthy  of  close 
attention.  Korea's  economy  has  been  growing  at  an  increasing  rate  for  more  than 
twenty  years.  As  a  result,  Korea  is  now  changing  from  a  weapon  importing  country  to 
a weapon producing country.

At  this  time,  it  could  be  meaningful  to  develop  new  CERs  for  fighter  aircraft. 
CERs  are  based  on  readily  available  explanatory  variables,  so  they  allow  the  decision 
maker  to  evaluate  the  cost  impact  of  future  designs  and  make  trade-offs  accordingly. 
After  acquisition,  the  potential  use  of  these  CERs  still  exists.  They  may  be  used  as 
validated  CERs  the  next  time.  However,  since  the  earlier  CERs  are  out  of  date  in  that 
they  did  not  include  the  newest  data,  developing  new  CERs  is  necessary. 


Korea's  particular  interests  regarding  fighter  aircraft  are  weight,  speed,  and 
electronic  equipment.  As  a  defense  force,  fighter  aircraft  must  be  sufficiently 
lightweight  that  they  can  be  used  quickly  to  react  against  attacking  aircraft.  However, 
as  interceptors,  fighter  aircraft  have  to  have  high  speed  capability  and  superior 
electronic  equipment  in  order  to  intercept  targets.  Therefore,  fighter  aircraft  must  be 
lightweight, yet be able to reach speeds of at least Mach 2.0 and carry the newest
superior electronic equipment. Fighter aircraft such as the F-16 or F-18, for example,
are the most suitable types for Korea.

C.       ORGANIZATION 

Chapter  II  introduces  some  of  the  CERs  that  have  been  developed  for  aircraft. 
Chapter  III  deals  with  the  data  collection  and  adjustment.  Chapter  IV  concerns  the 
statistical  approach  and  includes  a  discussion  of  the  ordinary  least-squares  method  as  a 
regression  technique.  Chapter  V  deals  with  the  analysis  of  the  established  models  and 
includes  a  description  of  the  prediction  analysis  which  estimates  the  costs  of  an 
international  fighter  aircraft  from  the  CERs  of  U.S.  fighter  aircraft.  Finally,  Chapter 
VI  offers  conclusions  regarding  the  interpretation  of  selected  CERs. 

II. PRIOR FIGHTER AIRCRAFT CERs

As implied earlier, CERs are based on historical data. These CERs are no better
than the data on which they are based. Therefore, reviewing some of the previously
developed methods and models may benefit the development of new CERs.

Many  organizations  have  developed  cost  models,  and  different  techniques  have 
been  employed.  Through  the  years  the  Rand  Corporation  has  organized  and  updated 
the  Department  of  Defense  (DOD)  data  base  for  airframe  costs,  identifying  the 
deficiencies  and  correcting  them  where  possible,  mainly  in  support  of  Air  Force 
sponsored  research  efforts. 

"A  Computer  Model  for  Estimating  Development  and  Procurement  Costs  of 
Aircraft  (DAPCA-III)",  which  was  published  in  1976,  is  one  of  Rand's  aircraft  airframe 
cost  models  [Ref.  3].  It  is  based  on  a  sample  of  twenty-five  U.S.  military  aircraft 
including  fighter,  attack,  bomber,  and  cargo  aircraft.  The  model  uses  CERs  to  estimate 
the  development  and  procurement  costs  of  two  major  flyaway  subsystems  of  the 
aircraft:  airframe  and  engines.  Avionics  costs  are  included  in  the  model  but  are  not 
derived  parametrically.  These  costs,  however,  do  not  quite  constitute  the  total  system 
cost  of  the  aircraft. 

Table  1  shows  the  CERs  used  in  DAPCA-III.  They  are  based  on  the  cost  of 
total  production  quantity  of  200  units  including  prototype  aircraft.  For  those  aircraft 
whose  total  production  quantity  is  less  than  200  units,  the  cost-quantity  relationship  or 
learning  curve  is  used  in  order  to  obtain  a  value  at  that  quantity.  CERs  used  in  the 
model are based on log-linear regressions (they are shown in the power form). The
major  explanatory  variables  are  airframe  unit  weight  and  maximum  speed  at  the  best 
altitude.  Additionally,  the  time  of  first  flight  in  calendar  quarters  after  1942  is  found  to 
be  a  significant  explanatory  variable  for  recurring  manufacturing  labor  and  materials, 
and  improves  the  statistical  properties  of  the  equation.  Thus,  equations  with  and 
without the time variable were considered separately. Also, a dummy variable
distinguishes cargo from noncargo aircraft in the flight test cost equation.

Costs  are  provided  in  seven  categories:  total  engineering  hours,  total  tooling 
hours,  nonrecurring  manufacturing  labor  hours,  recurring  manufacturing  labor  hours, 
nonrecurring  manufacturing  material  costs,  recurring  manufacturing  material  costs,  and 
flight  test  costs.   All  costs  used  in  the  model  are  in  constant  1975  dollars. 

TABLE 1
SELECTED CERS FROM THE DAPCA-III MODEL

$E = 20.032 \cdot W^{0.6636} \cdot S^{0.9871} \cdot 200^{-(b+1)} \cdot Q^{b+1} \cdot 10^{-6}$

$T = 522.39 \cdot W^{0.6214} \cdot S^{0.5323} \cdot 200^{-(b+1)} \cdot Q^{b+1} \cdot 10^{-6}$

$ML_{NR} = 0.62597 \cdot W^{0.6883} \cdot S^{1.2109} \cdot 10^{-6}$

$ML_{R} = 1188.5 \cdot W^{0.8306} \cdot S^{0.5464} \cdot T^{0.4711} \cdot 200^{-(b+1)} \cdot Q^{b+1} \cdot 10^{-6}$

$ML_{R} = 581.55 \cdot W^{0.7830} \cdot S^{0.4297} \cdot 200^{-(b+1)} \cdot Q^{b+1} \cdot 10^{-6}$

$MM_{NR} = 0.030614 \cdot W^{0.7240} \cdot S^{1.9240} \cdot 10^{-6}$

$MM_{R} = 93.409 \cdot W^{0.8121} \cdot S^{0.6951} \cdot T^{0.4744} \cdot 200^{-(b+1)} \cdot Q^{b+1} \cdot 10^{-6}$

$MM_{R} = 191.85 \cdot W^{1.8600} \cdot S^{0.8126} \cdot 200^{-(b+1)} \cdot Q^{b+1} \cdot 10^{-6}$

$FT = 153.25 \cdot W^{0.7095} \cdot S^{0.5856} \cdot Q_{FT}^{0.7160} \cdot DV^{1.5570} \cdot 10^{-6}$

where:

$E$ = total engineering hours (millions)

$T$ = total tooling hours (millions)

$ML_{NR}$ = nonrecurring manufacturing labor hours (millions)

$ML_{R}$ = recurring manufacturing labor hours (millions)

$MM_{NR}$ = nonrecurring manufacturing materials cost (millions of 1975 dollars)

$MM_{R}$ = recurring manufacturing materials cost (millions of 1975 dollars)

$FT$ = flight test cost (millions of 1975 dollars)

$W$ = airframe unit weight (lb)

$S$ = maximum speed at best altitude (kts)

$Q$ = airframe quantity

$b$ = exponent corresponding to cumulative average learning curve slope

$T$ = time of first flight (calendar quarters after 1942 = 4 x [input date - 1942.75])

$Q_{FT}$ = number of flight test aircraft

$DV$ = dummy variable (1 for noncargo, 2 for cargo aircraft)
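
As a rough illustration of how a power-form CER from Table 1 is evaluated, the following Python sketch computes total engineering hours from the equation for E. The function name and the input values (a 10,000 lb airframe, 1,000 kts, and a 90 percent learning curve slope) are hypothetical and chosen only for illustration; the coefficients are those listed for E in Table 1.

```python
import math


def dapca_engineering_hours(weight_lb, speed_kts, quantity, slope):
    """Total engineering hours (millions) from the Table 1 CER for E.

    slope is the cumulative average learning curve slope as a decimal
    (an assumed input), from which b = ln(slope) / ln(2).
    """
    b = math.log(slope) / math.log(2)
    # 200^-(b+1) * Q^(b+1) rescales the 200-unit baseline to Q units.
    quantity_factor = (quantity / 200.0) ** (b + 1)
    return (20.032 * weight_lb ** 0.6636 * speed_kts ** 0.9871
            * quantity_factor * 1e-6)


# Hypothetical inputs, for illustration only.
hours_200 = dapca_engineering_hours(10_000, 1_000, 200, 0.90)
hours_100 = dapca_engineering_hours(10_000, 1_000, 100, 0.90)
print(f"E at 200 units: {hours_200:.2f} million hours")
print(f"E at 100 units: {hours_100:.2f} million hours")
```

At the baseline quantity of 200 units the quantity factor equals one, so the CER reduces to its weight and speed terms alone.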


DAPCA-III  is  a  meaningful  model  for  use  as  a  long-range  planning  tool  for 
normal,  full  scale  production  programs.  However,  the  model  is  based  on  a  sample  of 
several different types of military aircraft. A cost model based on a more
homogeneous data sample is the result of the work of J. Large. It presents a
parametric cost model for fighter aircraft only [Ref. 4].

Large's "A Comparison of Cost Models for Fighter Aircraft", which was
published in 1977, is another of Rand's aircraft cost models and is referred to as the
Large model [Ref. 4]. It derives CERs to estimate fighter aircraft cost only. There
are two types of CERs in the model. One is derived from a sample of seventeen U.S.
military fighter aircraft only, while the other is derived from a sample of thirty-one
aircraft of different types. The fighter aircraft data in the larger sample include
several older fighter aircraft as well as new fighter aircraft.

Table  2  shows  the  CERs  based  on  a  sample  of  fighter  aircraft  only.  They  are 
based on cumulative total production quantity of 100 units. As in DAPCA-III, the
most reliable explanatory variables are airframe unit weight and maximum speed.
Additionally,  the  model  afforded  an  opportunity  to  examine  an  explanatory  variable 
that  was  thought  to  have  special  applicability  to  fighter  aircraft.  It  is  referred  to  as  the 
specific  power  (P)  and  represented  as 

$P = 0.003069 \times \dfrac{(\text{static thrust})(\text{max speed})}{\text{combat weight}}$

Both  speed  and  specific  power  were  considered  separately  along  with  weight  and  other 
variables  in  the  regression  analyses,  for  comparison  purposes. 
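
The specific power relationship can be sketched in Python as follows. The function name and the sample inputs are hypothetical. The constant 0.003069 is consistent with converting thrust (lb) times speed (kts) into horsepower, since one knot is about 1.68781 ft/s and one horsepower is 550 ft-lb/s, although the source does not state this derivation.

```python
KTS_TO_FT_PER_SEC = 1.68781    # feet per second in one knot
FT_LB_PER_SEC_PER_HP = 550.0   # definition of one horsepower


def specific_power(static_thrust_lb, max_speed_kts, combat_weight_lb):
    """Specific power P (hp per lb) as defined for the Large model."""
    return 0.003069 * static_thrust_lb * max_speed_kts / combat_weight_lb


# The unit conversion that appears to underlie the constant (an inference).
print(KTS_TO_FT_PER_SEC / FT_LB_PER_SEC_PER_HP)

# Hypothetical aircraft: 24,000 lb thrust, 1,150 kts, 20,000 lb combat weight.
print(specific_power(24_000, 1_150, 20_000))
```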

Costs  are  provided  in  seven  different  categories:  cumulative  total  engineering 
hours,  cumulative  total  tooling  hours,  development  support  cost,  flight  test  cost, 
cumulative  recurring  manufacturing  hours,  cumulative  recurring  manufacturing 
materials cost, and cumulative recurring quality control hours. In order to
accommodate the less detailed older data, two of the cost categories in DAPCA-III,
nonrecurring labor and nonrecurring materials, are combined into a single category,
development support. All costs used in the model are in constant 1973 dollars.

The Large model, as a model based on fighter aircraft only, compares the CERs
for fighter aircraft with the CERs for different types of aircraft, and with the CERs
used in DAPCA-III. However, the model is dated: the cost information for its older
aircraft is less reliable than that for later aircraft, and the development and
production experience of these earlier aircraft is not considered an appropriate
indicator of the future. Furthermore, as in DAPCA-III, the CERs used in the
model make use of subsystem characteristics in order to estimate the costs of airframe,
engines, etc. Therefore, it would be desirable to develop new CERs which are based on
recent aircraft data and make use of overall aircraft characteristics.

TABLE 2
SELECTED CERS FROM THE LARGE MODEL

$E_{100} = 0.000015 \cdot W^{1.14} \cdot S^{1.29}$

$E_{100} = 0.0276 \cdot W^{1.24} \cdot P^{0.72}$

$T_{100} = 0.0583 \cdot W^{0.657} \cdot S^{0.760}$

$T_{100} = 4.754 \cdot W^{0.715} \cdot P^{0.446}$

$ML_{100} = 0.097 \cdot W^{1.01} \cdot S^{0.706}$

$ML_{100} = 0.878 \cdot W^{0.986} \cdot P^{0.246}$

$MM_{100} = 0.0011 \cdot W^{1.08} \cdot S^{1.11}$

$MM_{100} = 0.404 \cdot W^{1.23} \cdot P^{0.567}$

$DS = 0.00032 \cdot W^{1.17} \cdot S^{0.63} \cdot FTA^{1.10}$

$DS = 0.037 \cdot W^{1.13} \cdot P^{0.53} \cdot FTA^{0.98}$

$FT = 0.00104 \cdot W^{0.65} \cdot S^{1.14} \cdot FTA^{1.22}$

$FT = 1.053 \cdot W^{0.72} \cdot P^{0.71} \cdot FTA^{1.16}$

$QC_{100} = 0.00029 \cdot W^{0.64} \cdot S^{1.35}$

$QC_{100} = 0.0321 \cdot W^{1.08} \cdot P^{0.57}$

where:

$E_{100}$ = cumulative total engineering hours at 100 aircraft (thousands)

$T_{100}$ = cumulative total tooling hours at 100 aircraft (thousands)

$ML_{100}$ = cumulative recurring manufacturing labor hours at 100 aircraft (thousands)

$MM_{100}$ = cumulative recurring materials cost at 100 aircraft (thousands of 1973 dollars)

$DS$ = development support cost (thousands of 1973 dollars)

$FT$ = flight test cost (thousands of 1973 dollars)

$QC_{100}$ = cumulative recurring quality control hours at 100 aircraft (thousands)

$W$ = airframe unit weight (lb)

$S$ = maximum speed (kts)

$P$ = specific power (hp/lb)

$FTA$ = number of flight test aircraft

"Cost  Estimating  Relationships  for  Tactical  Combat  Aircraft",  which  was 
published  by  IDA  (Institute  for  Defense  Analyses)  in  1984,  is  one  of  the  most  current 
cost  models  for  tactical  combat  aircraft  and  is  referred  to  as  the  IDA  model  [Ref.  5].  It 
is based on a sample of twenty-six U.S. military aircraft: fighter, attack, bomber
aircraft, etc. However, seven fighter and attack aircraft are used to develop the CERs
for RDT&E (Research, Development, Test and Evaluation) cost, and fourteen fighter
and attack aircraft for procurement cost.

Table  3  shows  CERs  used  in  the  IDA  model.  They  are  developed  to  estimate  the 
RDT&E  and  procurement  costs  of  fighter  and  attack  aircraft.  To  develop  the  CERs, 
overall  aircraft  characteristics  are  used,  and  this  is  one  of  the  main  features  of  the 
model.  CERs  used  in  the  model  are  based  on  log-linear  regressions.  Total  production 
quantity  of  400  units  is  selected  as  the  quantity  to  obtain  the  costs  for  the  regression. 
The major explanatory variables are DCPR (Defense Contractor's Planning Report)
weight, the ratio of thrust to DCPR weight, maximum speed at best altitude, and IOC
(Initial Operational Capability) date. DCPR weight is derived from empty weight by
use of the relationships indicated in Table 3.

Costs  are  provided  in  two  categories:  total  RDT&E  cost  and  cumulative  average 
flyaway  cost.  All  costs  used  in  the  IDA  model  are  in  FY  1985  TOA  (Total 
Obligational  Authority)  dollars.  A  cumulative  average  learning  curve  slope  of  0.92  is 
used  to  adjust  the  aircraft  cost  data  [Ref.  5:  p.5]. 

TABLE 3
SELECTED CERS FROM IDA MODEL

$RD = 2.18 \cdot 10^{-6} \cdot DCPR^{2.0493} \cdot (THRUST / DCPR)^{1.7} \cdot (1.0239)^{IOC-78}$

$FLY = 0.194 \cdot (DCPR / 1000)^{0.963} \cdot (SP / 100)^{0.760} \cdot (1.034)^{IOC-78}$

where:

$RD$ = total RDT&E cost (millions)

$FLY$ = cumulative average flyaway cost of 400 aircraft (millions)

$THRUST$ = total maximum thrust at sea level (lb)

$SP$ = maximum speed at best altitude (kts)

$IOC$ = initial operational capability date (last two digits of calendar year)

$DCPR$ = aircraft Defense Contractor's Planning Report weight (lb)

$DCPR = 0.0913 \cdot EW^{1.177}$     for EW > 50000

$DCPR = 0.246 \cdot EW^{1.096}$     for 10000 < EW < 50000

$DCPR = 13.26 \cdot EW^{0.674}$     for EW < 10000

$EW$ = aircraft empty weight (lb)
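
The relationships in Table 3 can be sketched in Python as shown below. This is a minimal sketch under stated assumptions: the function names and the sample inputs are hypothetical, the thrust term is taken as the ratio of thrust to DCPR weight (consistent with the thrust-weight ratio listed in Table 4), and the boundary weights of exactly 10,000 lb and 50,000 lb, which Table 3 leaves undefined, are assigned to the middle range.

```python
def dcpr_weight(empty_weight_lb):
    """DCPR weight (lb) from empty weight, using the Table 3 relationships.

    Table 3 gives strict inequalities; placing the boundary values
    10000 and 50000 in the middle range is an assumption made here.
    """
    ew = empty_weight_lb
    if ew > 50_000:
        return 0.0913 * ew ** 1.177
    if ew >= 10_000:
        return 0.246 * ew ** 1.096
    return 13.26 * ew ** 0.674


def ida_rdte_cost(dcpr_lb, thrust_lb, ioc_year):
    """Total RDT&E cost (millions); ioc_year is the last two digits."""
    return (2.18e-6 * dcpr_lb ** 2.0493
            * (thrust_lb / dcpr_lb) ** 1.7
            * 1.0239 ** (ioc_year - 78))


def ida_flyaway_cost(dcpr_lb, speed_kts, ioc_year):
    """Cumulative average flyaway cost of 400 aircraft (millions)."""
    return (0.194 * (dcpr_lb / 1000.0) ** 0.963
            * (speed_kts / 100.0) ** 0.760
            * 1.034 ** (ioc_year - 78))


# Hypothetical fighter: 15,000 lb empty, 24,000 lb thrust, 1,150 kts, IOC 1979.
dcpr = dcpr_weight(15_000)
print(f"DCPR weight : {dcpr:.0f} lb")
print(f"RDT&E cost  : {ida_rdte_cost(dcpr, 24_000, 79):.1f} million")
print(f"Flyaway cost: {ida_flyaway_cost(dcpr, 1_150, 79):.2f} million")
```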

Table 4 summarizes the characteristics of the three models. Since each
model has its own purpose, the characteristics differ from model to model. However,
it is notable that the predicted costs from the models are fairly similar. Table
5 compares the predicted F-16 costs for these three models.

TABLE 4
THE CHARACTERISTICS OF THREE MODELS

MODEL               DAPCA-III          Large              IDA

published year      1976               1977               1984

sampled aircraft    several types      fighter only       fighter, attack

sample size         25                 17, 31             7, 14

costs of CERs       subsystem          subsystem          overall system

major variables     weight, speed,     weight, speed,     weight, speed,
                    time of first      specific power     thrust-weight
                    flight, dummy                         ratio, IOC date

baseline quantity   cumulative 200     cumulative 100     cumulative 400

units of cost       1975 dollars       1973 dollars       1985 dollars


The  predicted  costs  of  DAPCA-III  and  Large  models  came  from  summing  up  all 
of  their  subsystem  costs.  The  last  page  of  Large  provides  a  good  comparison  between 
the  DAPCA-III  model  and  Large  model  of  F-16  cost  estimates  for  100  aircraft.  The 
estimates range from 8.867 to 10.356 million dollars, with the total flyaway cost by the
IDA  model  being  9.401  million  dollars.  The  actual  total  flyaway  cost  of  an  F-16  for 
100  aircraft  is  9.641  million  dollars  according  to  the  "US  Military  Aircraft  Cost 
Handbook" [Ref. 6: p.IV-337]. So, the predicted cost from the IDA model is within
about 2.5 percent of the actual cost, closer than most of the estimates from the other
models. There may be several reasons for this result. One of them is that the
DAPCA-III and Large models were published earlier than the IDA model. Also, it may
be that CERs based on the overall aircraft system are better than CERs based on the
subsystems.

TABLE 5
COMPARISON OF PREDICTED F-16 COSTS FOR THREE MODELS

MODEL          VARIANT                          PREDICTED COST

DAPCA-III      with time                             9.839
DAPCA-III      without time                         10.356

Large          fighter only, with power             10.004
Large          fighter only, with speed              8.867
Large          31 aircraft of several types         10.232

IDA            RDT&E cost                         1293.082
IDA            flyaway cost                          9.401

note :

1. Costs are based on the total production quantity of 100 units

2. All costs are in constant 1981 dollars (millions)

3. For price-level adjustments, price indices in Appendix B were used

4. Actual cost of an F-16A is 9.641 million dollars [Ref. 6: p.IV-337]
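
The size of each model's deviation from the actual F-16A cost can be checked with a short Python sketch. The figures are those reported in Table 5 and its note 4; the dictionary labels and the helper function name are hypothetical.

```python
ACTUAL_F16_COST = 9.641  # millions of constant 1981 dollars (Table 5, note 4)

# Predicted flyaway-type costs from Table 5 (millions of 1981 dollars).
predicted = {
    "DAPCA-III, with time": 9.839,
    "DAPCA-III, without time": 10.356,
    "Large, fighter only, with power": 10.004,
    "Large, fighter only, with speed": 8.867,
    "Large, 31 aircraft": 10.232,
    "IDA, flyaway": 9.401,
}


def percent_error(estimate, actual):
    """Signed error of an estimate as a percentage of the actual value."""
    return (estimate - actual) / actual * 100.0


errors = {name: percent_error(cost, ACTUAL_F16_COST)
          for name, cost in predicted.items()}

# List the estimates from smallest to largest absolute error.
for name, err in sorted(errors.items(), key=lambda item: abs(item[1])):
    print(f"{name:34s} {err:+6.2f}%")
```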

III. DATA COLLECTION AND ADJUSTMENTS

CERs  are  generally  obtained  from  the  statistical  analysis  of  historical  data.  Data 
must  be  collected  in  order  to  develop  CERs  and  then  adjusted  for  validity  and 
reliability.  Acquisition  of  data  is  the  process  of  identifying,  searching  out,  obtaining, 
verifying,  and  recording  the  specific  information  that  is  of  value  to  the  analyst. 

The  initial  step  in  developing  CERs  is  identifying  the  aircraft  of  interest  from  the 
many  types  of  aircraft  such  as  fighters,  bombers,  cargo  carriers,  reconnaissance  aircraft, 
helicopters,  etc.  However,  this  thesis  presents  the  CERs  for  fighter  aircraft  only.  The 
aircraft  data  used  has  been  collected  and  adjusted  from  unclassified  sources. 

A.       DATA  COLLECTION 

Developing reliable CERs, especially for a military application, is very difficult at
best, and there are many problems with the CERs used for military
hardware. The most significant problem with data collection on a military system is
obtaining complete information from unclassified documents. This has led to data
anomalies in weapon system cost estimation.

Early data were not systematically processed and stored, which makes the
historical information of limited value. In an attempt to alleviate this data collection
problem,  the  Contractor  Information  Report  (CIR)  Program  was  established  by  the 
Department  of  Defense  (DOD)  in  1966.  This  reporting  system  was  designed  to  collect 
costs  and  related  data  on  major  contracts  for  aircraft  and  missile  and  space  programs. 
The  CIR  was  enlarged  to  cover  the  other  areas  of  defense  contracting  with  the 
implementation  of  the  Contractor  Cost  Data  Reporting  System  (CCDR).  The  CCDR 
collects  contractor  costs  and  related  data  needed  to  satisfy  cost  estimating 
requirements. In recent years, The Analytical Science Corporation (TASC), with the
assistance of Management Consulting and Research, Inc. (MCR), has been compiling
data and analyzing the cost versus the effectiveness of tactical aircraft produced since
1950.

While  collecting  data,  the  levels  of  accuracy  and  aggregation  should  be 
considered  in  order  to  develop  new  CERs.  There  are  two  basic  categories  of  data: 
aircraft  physical  and  performance  parameters  and  cost.  The  sample  for  this  thesis 
consisted  of  the  following  aircraft: 

F-4E       F-14A      F-86F      F-104C

F-6A       F-15A      F-89D      F-105D

F-8E       F-16A      F-100D     F-106A

F-9F       F-18A      F-101B     F-111A

F-11A      F-84F      F-102A

The  model  developed  in  this  thesis  is  based  on  this  sample  of  nineteen  U.S. 
fighter  aircraft.  Since  the  purpose  of  this  thesis  is  to  provide  fighter-based  CERs,  only 
fighter  aircraft  data  were  collected.  The  parametric  data  for  fighter  aircraft  were 
obtained  from  references  7  to  11;  however,  Jane's  All  the  World's  Aircraft  was  used 
primarily. Most of the earlier CERs were out of date in that they did not include
aircraft introduced into the armed forces in the 1970's and 1980's, such as the F-14,
F-15, F-16 and F-18. However, the data used in this thesis include the newest fighter
aircraft.  In  order  to  obtain  reliable  CERs,  all  the  aircraft  included  in  this  thesis  had 
initial  flight  dates  following  1950.  Only  one  aircraft  has  been  selected  from  each  design 
of  fighter  aircraft  in  order  to  decrease  potential  multicollinearity  in  the  data  sample. 

The  cost  data  were  obtained  from  the  "US  Military  Aircraft  Cost  Handbook" 
[Ref.  6].  They  are  based  on  a  cumulative  total  production  quantity  of  100  units,  so  the 
costs  presented  in  Appendix  A  are  the  cumulative  average  total  flyaway  costs.  All 
costs  used  in  this  thesis  are  in  constant  1981  dollars. 

The  following  definitions  were  developed  and  used  as  a  basis  for  determining 
what  adjustments  would  have  to  be  made  to  the  data.   They  are: 

1)  Weight  :  maximum  take-off  gross  weight  (lb) 

2)  Thrust  :  total  maximum  engine  thrust  (lb) 

3)  Speed  :  maximum  speed  at  best  altitude  (kts) 

4)  Year  :  year  of  initial  operational  capability 

5) Cost : Cumulative Average Cost (CAC) of 100 units for total flyaway cost in
constant 1981 dollars (millions)

As in the IDA model, overall aircraft characteristics are used in order to estimate
the  fighter  aircraft  costs.  The  major  variables  for  airframe  cost  are  maximum  speed  at 
best  altitude,  maximum  take-off  gross  weight  and  initial  operational  capability  year. 
Some  other  variables  relating  to  aircraft  characteristics  (e.g.,  wing  span,  maximum 
thrust,  thrust-weight  ratio,  etc.)  were  tried  and  evaluated  but  generally  were  found  not 
to  be  significant.   Appendix  A  shows  the  total  data  base  used  in  this  thesis. 

B.       DATA ADJUSTMENT

The distortion of the sample observations used in generating CERs is another
significant problem encountered with military hardware. The major distortion arises
from data that have not been normalized. Information collected and reported should
be adjusted using standardized procedures such as those provided by the Cost
Accounting Standards Board,
which establishes consistency in accounting practices among government contractors.
Standardization  has  an  important  effect  upon  the  ability  of  DOD  contracting  personnel 
to  evaluate  proposals  and  better  determine  allocation  and  allowability  of  costs. 
Additionally,  when  using  data  for  different  purposes,  it  is  necessary  to  make  different 
adjustments  in  the  data.  The  two  most  common  adjustments  are  price-level  and  cost- 
quantity  adjustments. 

1.  Price-Level  Adjustments 

In  order  to  compare  the  cost  of  an  old  system  to  the  cost  of  a  new  system,  the 
cost  figures  must  be  adjusted  to  constant  dollars.  Adjustments  are  made  by  means  of  a 
price  index  constructed  from  a  time-series  of  data  in  which  one  year  is  selected  as  the 
base  and  the  value  for  that  year  expressed  as  100.  The  other  years  are  then  expressed 
as  percentages  of  this  base. 

Total  Obligational  Authority  (TOA)  dollars  in  a  year  (then-year  dollars)  are 
the  amounts  budgeted  in  a  specific  fiscal  year.  The  conversion  of  TOA  dollars  to 
constant  dollars  is  accomplished  by  dividing  TOA  by  a  composite  index  [Ref.  6: 
p.III-5].    Mathematically,  the  relationship  can  be  expressed  as 

$\text{Constant Dollars} = \dfrac{\text{TOA}}{\text{composite index}} \times 100$

Appendix  B  shows  the  deflators  index  and  composite  indices  used  by  the 
military  services  (e.g.,  Army,  Navy,  Air  Force).  The  composite  indices  are  based  on 
the  Office  of  the  Assistant  Secretary  of  Defense  (OASD),  Comptroller,  deflator  for 
major  commodity  procurement  and  service  outlay  profiles.  The  tables  are  based  on 
Fiscal  Year  (FY)  1981  and  all  index  numbers  are  related  to  FY81  constant  dollars.  So 
the  composite  indices  are  used  to  normalize  aircraft  procurement  costs  of  the  respective 
services  into  FY81  constant  dollars.  Multiplication  by  100  is  required  since  the  index  is 
expressed  as  a  percentage. 

As an example of price-level adjustment, the calculation of the total cost of the F-16
is presented. According to the Large model, the total cost of an F-16 from the
fighter sample using specific power is 4.84 million in constant 1973 dollars. The
composite index of 1973 is 48.38 [Ref. 4: p. 15]. Therefore, the cost in constant 1981
dollars can be calculated from these values, that is

$\text{Constant 1981 Dollars} = \dfrac{4.84}{48.38} \times 100 = 10.004$ (millions)
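
The price-level adjustment above can be reproduced with a few lines of Python. The function name is hypothetical; the inputs are the 4.84 million cost and the 1973 composite index of 48.38 quoted in the example.

```python
def to_constant_dollars(amount, composite_index):
    """Convert an amount to base-year constant dollars.

    composite_index is expressed as a percentage of the base year
    (base year = 100), so the quotient is multiplied by 100.
    """
    return amount / composite_index * 100.0


# F-16 cost from the Large model, adjusted to constant 1981 dollars.
print(round(to_constant_dollars(4.84, 48.38), 3))  # 10.004
```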

2.  Cost-Quantity  Adjustments 

Learning  curves,  as  cost-quantity  relationships,  are  used  in  order  to  develop 
consistent  measures  of  costs.  The  basis  of  learning  curve  theory  is  that  each  time  the 
total  quantity  of  items  produced  doubles,  the  cost  per  item  is  reduced  to  a  constant 
percentage  of  its  previous  cost.  So  if  the  average  cost  of  producing  all  200  units  is  90 
percent  of  the  average  cost  of  producing  the  first  100  units,  the  process  follows  a  90 
percent  cumulative  average  learning  curve. 

The  cost-quantity  relationships  are  represented  using  regression  analysis 
techniques  assuming  the  following  functional  form: 

$C_n = C_1 \cdot n^b$       or       $\ln(C_n) = \ln(C_1) + b \cdot \ln(n)$

where:

ln = the natural logarithm function

$C_n$ = cumulative average cost for quantity n

$n$ = cumulative production quantity

$C_1$ = the cost of the first unit produced

$b$ = the exponent related to the slope of the learning curve

The slope, S, is related to b as

$S = 2^b$       or       $b = \dfrac{\ln(S)}{\ln(2)}$

where:

$S$ = slope expressed as a decimal

Therefore, the coefficient b means that when cumulative production doubles, the
cumulative average cost falls to S times its previous value; that is, it decreases by
(1 - S) x 100 percent.
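
The relationship between the slope S and the exponent b can be sketched in Python as follows. The function names and the first-unit cost of 10 are hypothetical; the sketch only illustrates that doubling the quantity multiplies the cumulative average cost by S.

```python
import math


def slope_to_exponent(slope):
    """Exponent b for a learning curve slope S given as a decimal."""
    return math.log(slope) / math.log(2)


def exponent_to_slope(b):
    """Learning curve slope S = 2^b."""
    return 2.0 ** b


def cumulative_average_cost(first_unit_cost, quantity, slope):
    """C_n = C_1 * n^b, the cumulative average cost for n units."""
    return first_unit_cost * quantity ** slope_to_exponent(slope)


# Doubling from 100 to 200 units on a 90 percent curve (hypothetical C_1 = 10).
c100 = cumulative_average_cost(10.0, 100, 0.90)
c200 = cumulative_average_cost(10.0, 200, 0.90)
print(f"b = {slope_to_exponent(0.90):.5f}")
print(f"C_200 / C_100 = {c200 / c100:.2f}")
```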

As an example of cost-quantity adjustment, the total flyaway cost of the F-6A is
calculated. The cumulative average cost of 230 aircraft is 3.584 million dollars and
that of 408 aircraft is 3.051 million dollars [Ref. 6: p. IV-278]. So, based upon these
two points, the learning curve slope can be plotted at about 0.84. As implied earlier,
the equation which calculates the cumulative average cost of n aircraft from the cost
of the first unit produced is expressed as

$$C_n = C_1 \cdot n^b \quad \text{where} \quad b = \frac{\ln(S)}{\ln(2)}$$

Therefore,

$$b = \frac{\ln(0.84)}{\ln(2)} = -0.25154$$

Thus,

$$C_{230} = C_1 \cdot 230^{-0.25154} = C_1 \cdot (0.25464)$$

$$C_1 = \frac{3.584}{0.25464}$$

From this value it is possible to calculate the cumulative average cost of 100 aircraft of
the F-6A. The cost is

$$C_{100} = \frac{3.584}{0.25464} \cdot 100^{-0.25154} = 4.419 \text{ (million dollars)}$$

The costs used in this thesis are the Cumulative Average Cost (CAC) for a quantity
of 100 units. Each fighter has a different learning curve with a unique slope.
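
The cost-quantity adjustment can be reproduced with a short Python sketch. The function names are hypothetical; the inputs are the slope of 0.84 and the cumulative average cost of 3.584 million dollars at 230 aircraft quoted in the example above.

```python
import math


def learning_exponent(slope: float) -> float:
    """Return b = ln(S) / ln(2) for a learning curve slope S given as a decimal."""
    return math.log(slope) / math.log(2.0)


def first_unit_cost(cac: float, quantity: float, b: float) -> float:
    """Solve C_n = C_1 * n**b for C_1, given the cumulative average cost at n."""
    return cac / quantity ** b


def cumulative_average_cost(c1: float, quantity: float, b: float) -> float:
    """Return C_n = C_1 * n**b."""
    return c1 * quantity ** b


b = learning_exponent(0.84)
c1 = first_unit_cost(3.584, 230, b)
cac_100 = cumulative_average_cost(c1, 100, b)
print(round(b, 5), round(cac_100, 3))
```

Normalizing every aircraft to the same quantity in this way is what makes the costs comparable across programs with different production runs.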
IV. STATISTICAL APPROACH

CERs  are  developed  from  the  historical  cost  of  systems  and  the  explanatory 
variables  of  those  systems.  Therefore,  some  variables  which  are  logically  and 
theoretically  related  to  cost  have  to  be  selected  in  order  to  develop  reliable  CERs.  An 
important  characteristic  of  reliable  CERs  is  that  the  relationship  between  cost  and 
explanatory  variables  must  be  direct  and  obvious. 

Regression analysis can be applied as a statistical technique to develop CERs
from the historical cost and parametric data. Regression analysis is primarily
concerned with the determination of the equation of a line or curve which will predict
how the dependent variable will vary with respect to some independent variables.
Therefore, regression analysis will estimate the coefficients of the equation (e.g.,
intercept and slopes) and infer the reliability and significance of the results of the
estimate. (Johnston's Econometric Methods [Ref. 12] is the source of all facts and
derivations shown in this chapter.)

Generally,  there  are  two  types  of  linear  regression  models,  simple  and  multiple. 
The  difference  between  these  two  models  is  the  number  of  variables  in  the  equation. 
The  simple  linear  regression  model  has  only  two  variables,  while  the  multiple  linear 
regression  model  has  more  than  two  variables. 

A.       SIMPLE  LINEAR  REGRESSION 

The  equation  used  in  simple  linear  regression  has  two  variables,  cost  and  an 
explanatory  variable.  This  means  that  the  cost  is  expressed  as  a  linear  function  of  an 
explanatory  variable.  Thus,  as  an  example  of  the  simple  linear  regression  model,  the 
linear  relationship  is 

$$y = \alpha + \beta x + u$$

where:

y = the dependent (cost) variable

x = the independent (explanatory) variable

$\alpha$ = the intercept of the line

$\beta$ = the slope of the line

u = error term between the actual cost and expected cost of y
Additionally, the log-linear regression model is very frequently used as another
method of expressing the linear model. The log-linear equation results from taking
logarithms of both sides of a multiplicative equation, and is written as

$$y = e^{\alpha} \cdot x^{\beta} \cdot e^{u} \quad \text{or} \quad \ln(y) = \alpha + \beta \cdot \ln(x) + u$$

Thus, this equation graphs as a linear relationship when plotted in terms of ln(x) and
ln(y).

There are some assumptions made with regard to the error term. The first
assumption is that the error term is normally distributed with zero mean and variance
$\sigma^2$, that is

$$u \sim N(0, \sigma^2)$$

The second assumption is that the error terms for different x values are independent and
identically distributed.

1.  Least-Squares  Estimation 

As implied earlier, the simple linear regression model has some unknown
parameters: $\alpha$, $\beta$, and $\sigma^2$. Those unknown parameters have to be estimated in
order to establish CERs. Least squares is the most frequently used method for
estimating the unknown parameters.

By using the simple linear regression model, the actual cost of the system is
indicated by

$$y_i = \alpha + \beta x_i + u_i$$

where $y_i$ is the actual cost of the ith observation. Then, any straight line drawn
through the scatter of data points may be regarded as an estimate of the hypothesized
relationship $y = \alpha + \beta x + u$. A straight line is indicated by

$$y^* = a + bx$$

where $y^*$ indicates the value of the line at any given value of x.

The principle of least squares is that the unknown parameters are selected
to minimize the sum of squared residuals. This minimization is expressed as

$$\min \sum e_i^2$$
Under this principle, the unknown parameters are determined as

$$a = \bar{Y} - b\bar{X}$$

$$b = \frac{\sum (x_i - \bar{X})(y_i - \bar{Y})}{\sum (x_i - \bar{X})^2}$$

where $\bar{X}$ and $\bar{Y}$ are the sample means of x and y. The difference between the
actual cost and the expected cost is defined as the residual, which is written as

$$e_i = y_i - y_i^* = y_i - (a + bx_i)$$

where $e_i$ is the residual of the ith observation. Also, under the minimization principle,
the unbiased estimator for $\sigma^2$ is determined as

$$s^2 = \frac{\sum e_i^2}{n - 2}$$
The following are some properties of least squares. First, the expected
values of the estimators a and b are exactly the same as the values of $\alpha$ and $\beta$.
This is indicated by

$$E[a] = \alpha \quad \text{and} \quad E[b] = \beta$$

Thus, a and b, as the least-squares estimators in a simple linear regression model, are
unbiased estimators for $\alpha$ and $\beta$. Secondly, the least-squares estimators have the
minimum variances among all linear unbiased estimators. As a result, the least-squares
estimators for $\alpha$ and $\beta$ are called the best linear unbiased estimators [Ref. 13: p. 473].
The minimum variance property is the major reason why least squares is so frequently
employed in estimating unknown parameters.

By using least squares, several simple linear regression models are obtained.
Then, the log-linear function can be selected as the best simple linear model. It is

$$C = 0.172 \cdot T^{1.230} \quad \text{or} \quad \ln(C) = \ln(0.172) + 1.230 \cdot \ln(T)$$

which can be rewritten as

$$C' = -1.760 + 1.230 \cdot T'$$

where:

C = total flyaway cost of fighter aircraft in constant 1981 dollars (millions)

T = total maximum engine thrust (lb)

C' = ln(C) and T' = ln(T)
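
The closed-form least-squares estimators can be sketched in plain Python as follows. The thrust values below are hypothetical and the costs are generated exactly from the fitted equation, so the sketch only illustrates the mechanics of a log-linear fit; it does not reproduce the Appendix A data.

```python
import math


def fit_simple_ols(x, y):
    """Return (a, b, s2): intercept, slope, and the unbiased error-variance estimate."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = y_bar - b * x_bar
    residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    s2 = sum(e * e for e in residuals) / (n - 2)
    return a, b, s2


# Hypothetical thrust values; costs follow C = 0.172 * T**1.230 with no error term.
thrust = [10.0, 15.0, 20.0, 30.0, 40.0]
cost = [0.172 * t ** 1.230 for t in thrust]

# Regress ln(C) on ln(T) to recover the log-linear coefficients.
a, b, s2 = fit_simple_ols([math.log(t) for t in thrust],
                          [math.log(c) for c in cost])
print(round(a, 3), round(b, 3))
```

Because the synthetic costs contain no error term, the fit returns the generating coefficients and a residual variance of essentially zero; with real data the residual variance feeds the standard errors used in the tests that follow.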
2. The Correlation Coefficient

The selected model must be examined in order to determine the reliability or
accuracy of that equation. There are several statistical measures that can indicate the
goodness of fit of the equation in describing data. $R^2$ is the most commonly used
measure of the goodness of fit and is defined as the coefficient of determination, which
is the square of the correlation coefficient (R). $R^2$ is computed as follows:

$$R^2 = \frac{\text{Explained sum of squares}}{\text{Total sum of squares}} = \frac{\sum (y_i^* - \bar{Y})^2}{\sum (y_i - \bar{Y})^2}$$

$$= 1 - \frac{\text{Residual sum of squares}}{\text{Total sum of squares}} = 1 - \frac{\sum e_i^2}{\sum (y_i - \bar{Y})^2}$$

$R^2$ is the proportion of the total variation which can be explained by the regression
model. The highest possible value of $R^2$ is 1.00, which corresponds to all data points
lying on the regression line, and the lowest is 0.00.

The value of $R^2$ from the log-linear regression model above is

$$R^2 = 0.7007$$

which is a relatively low value. It means that thrust alone does not explain all of the
variance in the cost data. It also means that the log-linear model above does not fit
the data well. Usually, there are two ways to increase $R^2$ to a relatively high value.
They are:

1)  To  add  some  other  variables  into  that  equation.  Adding  variables  may  explain 
the  remaining  variance.   This  will  be  discussed  in  Section  B  below. 

2)  To  find  other  equations.  If  the  simple  linear  regression  model  does  not  fit  the 
data  well,  then  multiple  linear  regression  models  with  other  variables  may  fit 
the  data  better. 
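
A minimal Python sketch of the coefficient of determination follows. The function name and the four-point data set are hypothetical and only illustrate the bounds of 1.00 and 0.00; the final lines use the residual and total sums of squares reported for the thrust model in this chapter.

```python
def r_squared(actual, fitted):
    """Return 1 - (residual sum of squares) / (total sum of squares)."""
    y_bar = sum(actual) / len(actual)
    total_ss = sum((yi - y_bar) ** 2 for yi in actual)
    residual_ss = sum((yi - fi) ** 2 for yi, fi in zip(actual, fitted))
    return 1.0 - residual_ss / total_ss


# Hypothetical observations used only to show the two extreme cases.
y = [1.0, 2.0, 3.0, 4.0]
r2_perfect = r_squared(y, [1.0, 2.0, 3.0, 4.0])    # every point on the line
r2_mean_only = r_squared(y, [2.5, 2.5, 2.5, 2.5])  # line explains nothing

# Sums of squares reported for the thrust model (residual and total).
r2_thrust = 1.0 - 4.799673 / 16.038440
print(r2_perfect, r2_mean_only, round(r2_thrust, 4))
```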
3. Statistical Inference

As implied earlier, the hypothesized relationship between the dependent
variable (y) and independent variable (x) may be indicated by

$$y = \alpha + \beta x + u$$

where u is an error term. Under this relationship, the least-squares method produces
unbiased estimators a and b. Thus, the resulting least-squares regression line is

$$y_i^* = a + bx_i$$

Standard statistical techniques can be applied to the least-squares result to test for
significance and to make inferences about reliability and accuracy in a probabilistic
sense.

a.   t-test 

It  is  necessary  to  test  the  relationship  between  y  and  x.  This  is  done  by 
establishing  the  null  hypothesis  that  y  and  x  are  not  related  to  each  other,  and  the 
alternative  hypothesis  that  y  and  x  are  related  to  each  other: 

$$H_0: \beta = 0$$

$$H_1: \beta \neq 0$$

These hypotheses are the most frequently used, and are referred to as testing the
significance of x. By a similar development, tests on the intercept are

$$H_0: \alpha = 0$$

$$H_1: \alpha \neq 0$$

The test that is commonly used for this purpose is known as the t-test
because the tests on $\alpha$ and $\beta$ are based on the t distribution. It follows then that:

$$t_a = \frac{a - \alpha}{S_a} \sim t(n-2)$$

$$t_b = \frac{b - \beta}{S_b} \sim t(n-2)$$

where:

$S_a$ = the standard error of a

$$S_a = s \sqrt{\frac{\sum x_i^2}{n \sum (x_i - \bar{X})^2}}$$

$S_b$ = the standard error of b

$$S_b = \frac{s}{\sqrt{\sum (x_i - \bar{X})^2}}$$

s = the standard error of regression

$$s = \sqrt{\frac{\sum e_i^2}{n - 2}}$$

If the absolute value of the sample t statistic is greater than the preselected critical
value of t, we accept the alternative hypothesis and conclude that x plays a significant
role in the determination of y. The following values result from the least-squares
regression line based on the data in Appendix A. They are

a = -1.760444
b = 1.230396
$S_a$ = 0.586718
$S_b$ = 0.195015

Since n = 19, from the t distribution with 17 degrees of freedom,

$$t_{0.025}(17) = 2.110$$

Thus, the intercept is significantly different from zero since

$$|t_a| = |-3.000| = 3.000 > 2.110$$

Also, the slope is significant since

$$t_b = 6.309 > 2.110$$
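
The two t statistics can be checked with a short Python sketch. The function name is hypothetical; the estimates, standard errors, and the critical value 2.110 are the ones quoted above, with the critical value hard-coded from the text rather than computed.

```python
def t_statistic(estimate: float, std_error: float, hypothesized: float = 0.0) -> float:
    """Return (estimate - hypothesized value) / standard error."""
    return (estimate - hypothesized) / std_error


T_CRITICAL_17_DF = 2.110  # two-sided 5 percent critical value quoted in the text

t_a = t_statistic(-1.760444, 0.586718)
t_b = t_statistic(1.230396, 0.195015)

intercept_significant = abs(t_a) > T_CRITICAL_17_DF
slope_significant = abs(t_b) > T_CRITICAL_17_DF
print(round(t_a, 3), round(t_b, 3), intercept_significant, slope_significant)
```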


b.    Confidence  Interval 

Examining the confidence intervals for $\alpha$ and $\beta$ is another way to test the
significance of the unbiased estimators a and b. A confidence interval which
includes zero is equivalent to accepting the null hypothesis that the true value of the
parameter is zero, and an interval which does not include zero is equivalent to
rejecting the null hypothesis.

Generally, 100(1 - p) percent confidence intervals for $\alpha$ and $\beta$ are indicated
by

$$CI(\alpha) = a \pm t_{p/2} \cdot S_a$$

$$CI(\beta) = b \pm t_{p/2} \cdot S_b$$

where $S_a$ and $S_b$ are the standard errors of a and b.

A 95 percent confidence interval for $\alpha$ is then

$$CI(\alpha) = -1.760 \pm (2.110 \times 0.587)$$

or -2.998 to -0.522

Also, a 95 percent confidence interval for $\beta$ is

$$CI(\beta) = 1.230 \pm (2.110 \times 0.195)$$

or 0.819 to 1.642

Therefore, the fact that the confidence intervals for $\alpha$ and $\beta$ do not include zero means
that the null hypotheses are rejected, and the unbiased estimators a and b are
statistically significant.
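
The interval arithmetic can be sketched in Python as follows. The function name is hypothetical; the inputs are the rounded estimates, standard errors, and critical value used in the text, so the bounds agree with the reported ones only to about three decimals.

```python
def confidence_interval(estimate: float, std_error: float, t_critical: float):
    """Return (lower, upper) = estimate -/+ t_critical * std_error."""
    half_width = t_critical * std_error
    return estimate - half_width, estimate + half_width


ci_alpha = confidence_interval(-1.760, 0.587, 2.110)
ci_beta = confidence_interval(1.230, 0.195, 2.110)


def excludes_zero(interval) -> bool:
    """An interval that excludes zero corresponds to rejecting the null hypothesis."""
    lower, upper = interval
    return lower > 0.0 or upper < 0.0


print([round(v, 3) for v in ci_alpha], [round(v, 3) for v in ci_beta])
```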

c.   F-test 

The analysis of variance (ANOVA) test is merely a significance test on $\beta$
performed in another way, and is referred to as the F-test. The F statistic is the ratio
of the mean square due to x over the residual mean square. Thus, it is indicated by

$$F = \frac{b^2 \sum (x_i - \bar{X})^2}{\sum e_i^2 / (n-2)} \sim F(1, n-2)$$

The significance of x is thus tested by examining whether the sample F exceeds the
appropriate critical value of F taken from the upper tail of the F distribution.
Therefore, the test procedure is that if the sample value of F is greater than the critical
value of F(1, n - 2), then reject $H_0: \beta = 0$.

Usually, the F-test will be applied extensively in multiple linear regression
models. However, in simple linear regression models, the F variable with (1, k) degrees
of freedom is the square of a t value with k degrees of freedom. The relationship
between the t and F distributions can be explained with the correlation coefficient, R.
It is

$$t = \frac{R \sqrt{n-2}}{\sqrt{1 - R^2}}$$

$$F = \frac{R^2}{(1 - R^2)/(n-2)} = t^2$$

The ANOVA for the least-squares regression line based on the 19
observations in Appendix A is as follows:

Source      Degrees of freedom    Sum of squares    Mean square
Thrust               1               11.238766       11.238766
Residual            17                4.799673        0.282334
Total               18               16.038440

Since n = 19, using the F distribution with 1 and 17 degrees of freedom,

$$F_{0.95}(1, 17) = 4.451$$

The sample F statistic is

$$F = \frac{11.239}{0.282} = 39.807 > 4.451$$

Thus, $H_0: \beta = 0$ is rejected. It means that the slope is not zero.
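
The ANOVA calculation and its link to the t-test can be verified with the sketch below. The function names are hypothetical; the sums of squares are those in the ANOVA table and 6.309 is the slope t statistic reported earlier.

```python
def f_from_anova(ss_regression: float, ss_residual: float,
                 df_regression: int, df_residual: int) -> float:
    """Return the ratio of the regression mean square to the residual mean square."""
    return (ss_regression / df_regression) / (ss_residual / df_residual)


def f_from_r_squared(r2: float, n: int) -> float:
    """Return F = R^2 (n - 2) / (1 - R^2) for a simple linear regression."""
    return r2 * (n - 2) / (1.0 - r2)


f_anova = f_from_anova(11.238766, 4.799673, 1, 17)
r2 = 11.238766 / 16.038440
f_r2 = f_from_r_squared(r2, 19)
t_b = 6.309  # slope t statistic reported in the text

print(round(f_anova, 3), round(f_r2, 3), round(t_b ** 2, 3))
```

All three routes give approximately the same value, which illustrates that in a simple regression the F-test and the two-sided t-test on the slope are the same test.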
B.       MULTIPLE LINEAR REGRESSION

In the previous section, the linear relationship between cost and thrust was
examined as a simple linear regression model. It was selected as the best model using
two variables, and the relationship was represented with a log-linear function. However,
its low $R^2$ means that a model with only one independent variable, thrust, cannot
fit the data well. Therefore, some other models which have more than one
independent variable have to be examined.

Multiple  linear  regression  models  have  more  than  one  independent  variable. 
Thus,  the  vector  of  sample  observations  on  the  dependent  variable  (Y),  may  be 
expressed  as  a  linear  combination  of  the  sample  observations  on  the  independent 
variables  (X)  and  the  vector  of  the  error  term  (u).  An  example  of  the  hypothesized 
multiple  linear  regression  model  is  represented  as 

$$Y = \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_k X_k + u$$

where:

Y = the vector for the dependent (cost) variable

$X_1$ = the unit vector for an intercept

$X_i$ = the vectors for the independent variables ($i \neq 1$)

$\beta_i$ = unknown parameters

u = the vector of error terms

Each vector is a column vector of n elements. The multiple linear regression model
may also be expressed in matrix form as

$$Y = X\beta + u$$

where Y and u are n x 1 matrices, X is an n x k matrix, and $\beta$ is a k x 1 matrix.

Like the simple linear regression model, there are some assumptions made for the
multiple linear regression model. They are:

1) The u vector has a multivariate normal distribution, with each element of u
having a zero mean and the same variance ($\sigma^2$). That is

$$u \sim N(0, \sigma^2 I)$$

where I is the identity matrix.

2) X is a nonstochastic matrix and its rank is k. That is

$$\rho(X) = k$$
1. Ordinary Least-Squares (OLS) Estimation

As implied earlier, the hypothesized multiple linear regression model and the
vector of the fitted line are indicated by

$$Y = X\beta + u$$

$$Y^* = Xb$$

where b is a k-element vector. Thus, a vector of errors or residuals can be defined as

$$e = Y - Xb$$

The principle of least squares is that b is selected to minimize the sum of
the squared residuals, e'e. Under this principle, b is determined as

$$b = (X'X)^{-1}X'Y = \beta + (X'X)^{-1}X'u$$

Then the variance-covariance matrix of the OLS estimators is

$$\text{var}(b) = \sigma^2 (X'X)^{-1}$$

where the elements on the main diagonal of this matrix give the sampling variances of
the corresponding elements of b, and the off-diagonal terms give the sampling
covariances.

Since the expected value of b is exactly the same as the value of $\beta$, the OLS
estimators are linear unbiased estimators. This is indicated by

$$E[b] = \beta$$

Also, since the OLS estimators have the minimum sampling variances among all of the
linear unbiased estimators, b is the best linear unbiased estimator (b.l.u.e.). Using
OLS, two equations are selected as the best models. They are

$$C_{19} = -701.635 + 0.215W + 0.358Y$$

$$C_{6} = -3994.618 + 0.688W + 2.013Y$$

where:

W = (maximum take-off gross weight) / 1000

Y = year of initial operational capability
The former model is based on the 19 data points in Appendix A, while the
latter is based on only 6 data points. The 6 data points used in the latter
model have an initial operational capability year of 1965 or after, which means that
the latter model is based on relatively new aircraft data. The 6 data points
contained in the latter model are

F-4E     F-14A    F-15A

F-16A    F-18A    F-111A
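
The estimator b = (X'X)^{-1}X'Y can be sketched in plain Python by solving the normal equations (X'X)b = X'Y. Everything in the example data is hypothetical: the weights, the years (measured from 1960, an assumed rescaling that keeps the normal equations well conditioned), and the generating coefficients. It does not reproduce the Appendix A data.

```python
def solve_linear_system(a, rhs):
    """Solve a @ x = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [list(a[i]) + [rhs[i]] for i in range(n)]
    for col in range(n):
        # Move the row with the largest entry in this column into the pivot position.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def ols_coefficients(x_rows, y):
    """Return b from the normal equations (X'X) b = X'Y."""
    k = len(x_rows[0])
    xtx = [[sum(row[i] * row[j] for row in x_rows) for j in range(k)]
           for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(x_rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical data: weight in thousands of pounds, year measured from 1960.
weight = [20.0, 35.0, 50.0, 60.0, 42.0, 80.0]
years_since_1960 = [5.0, 14.0, 15.0, 18.0, 19.0, 7.0]
true_b = [2.0, 0.5, 0.3]
cost = [true_b[0] + true_b[1] * w + true_b[2] * t
        for w, t in zip(weight, years_since_1960)]

# The first column of ones is the unit vector X1 for the intercept.
x_rows = [[1.0, w, t] for w, t in zip(weight, years_since_1960)]
b = ols_coefficients(x_rows, cost)
print([round(v, 6) for v in b])
```

Using the calendar year directly, as the selected models do, makes the intercept and year columns nearly collinear, which is consistent with the correlation of about -1.0000 between those two estimates reported in Table 6.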

2.  The  Correlation  Coefficient 

The correlation coefficient is the most commonly used measure of the
goodness of fit. The squared multiple correlation coefficient for the k-variable model
is defined as

$$R^2 = \frac{\text{Explained sum of squares}}{\text{Total sum of squares}} = 1 - \frac{\text{Residual sum of squares}}{\text{Total sum of squares}} = 1 - \frac{e'e}{Y'AY}$$

where:

A = I - (1/n)ii'

I = identity matrix

i = a column vector of n ones

Y'AY = the sum of squared deviations in Y

The value of $R^2$ from the selected multiple regression model based on the 19
observations is

$$R^2_{19} = 0.7504$$

Although this is a slightly higher value than that of the simple regression model, the
value is still relatively low. It means that the weight and year do not explain all of the
variance in the cost data, and the model does not fit the data well.


However, the value of $R^2$ from the selected multiple regression model based
on the 6 observations is

$$R^2_{6} = 0.9441$$

This is a relatively high value; thus the weight and year variables fit the 6
data points well.

The value of $R^2$ adjusted for degrees of freedom is useful when comparing
models with different numbers of independent variables, and is referred to as the
adjusted $R^2$. The adjusted $R^2$ is defined as

$$\bar{R}^2 = 1 - \frac{e'e/(n-k)}{Y'AY/(n-1)}$$

Thus, the values of adjusted $R^2$ for the selected simple and multiple regression models
are

$$\bar{R}^2 = 0.6831$$

$$\bar{R}^2_{19} = 0.7192$$

$$\bar{R}^2_{6} = 0.9068$$

Therefore, the adjusted $R^2$ values are almost the same as the corresponding values
of $R^2$.
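
Since $e'e / Y'AY = 1 - R^2$, the adjusted value can be computed directly from $R^2$, n, and k. The sketch below uses a hypothetical function name and the three $R^2$ values reported in this chapter, where k counts the intercept (k = 2 for the simple model and k = 3 for the multiple models).

```python
def adjusted_r_squared(r2: float, n: int, k: int) -> float:
    """Return 1 - (1 - R^2) (n - 1) / (n - k), with k including the intercept."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k)


adj_simple = adjusted_r_squared(0.7007, n=19, k=2)
adj_19 = adjusted_r_squared(0.7504, n=19, k=3)
adj_6 = adjusted_r_squared(0.9441, n=6, k=3)
print(round(adj_simple, 4), round(adj_19, 4), round(adj_6, 4))
```

The penalty is largest for the 6-observation model because only three degrees of freedom remain after fitting three coefficients.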

3.  Statistical  Inference 

The characteristics of the multiple linear regression models were already
mentioned at the beginning of this chapter. According to them, b is distributed as

$$b \sim N(\beta, \sigma^2 (X'X)^{-1})$$

Then, the estimator of the error variance $\sigma^2$ is defined as

$$s^2 = \frac{e'e}{n-k}$$

and s is the standard error of the regression.
a.   t-test

Since b is the estimated coefficient vector for X, $b_i$ is the estimated
coefficient of $X_i$ in the OLS regression, and b is distributed independently of $s^2$.
Thus the t-test of the multiple linear regression is determined as

$$t = \frac{b_i - \beta_i}{s \sqrt{a_{ii}}} \sim t(n-k)$$

where $a_{ii}$ denotes the ith element on the principal diagonal of $(X'X)^{-1}$.

Hypotheses are established about $\beta_i$, where the null hypothesis is $H_0: \beta_i = 0$
and the alternative hypothesis is $H_1: \beta_i \neq 0$. Then, the t statistics of the selected
multiple linear models are as follows:

Model        Based on 19 obs.    Based on 6 obs.
Intercept         -2.879              -6.185
Weight             3.535               7.088
Year               2.864               6.194

If n = 19, from the t distribution with 16 degrees of freedom,

$$t_{0.025}(16) = 2.120$$

and if n = 6, with 3 degrees of freedom,

$$t_{0.025}(3) = 3.182$$

Thus, since all of the t statistics based on the selected models are greater in absolute
value than their critical values, the coefficients are significantly different from zero.

b.    Confidence  Interval 

The 100(1 - p) percent confidence intervals for the coefficients of Weight
($X_2$) and Year ($X_3$) are indicated by

$$CI(\beta_2) = b_2 \pm t_{p/2} \cdot S_2$$

$$CI(\beta_3) = b_3 \pm t_{p/2} \cdot S_3$$

where $S_2$ and $S_3$ are standard errors of $b_2$ and $b_3$.

The following values result from the least-squares regression lines
based on the data in Appendix A:

Model        Based on 19 obs.    Based on 6 obs.
b2                 0.215               0.688
b3                 0.358               2.013
S2                 0.061               0.097
S3                 0.125               0.325

Thus, the 95 percent confidence intervals for $\beta_2$ and $\beta_3$ based on the 19 observations
are

$$CI(\beta_2) = 0.215 \pm (2.120 \times 0.061)$$

or 0.086 to 0.343

$$CI(\beta_3) = 0.358 \pm (2.120 \times 0.125)$$

or 0.093 to 0.623

Also, the confidence intervals based on the 6 observations are

$$CI(\beta_2) = 0.688 \pm (3.182 \times 0.097)$$

or 0.379 to 0.997

$$CI(\beta_3) = 2.013 \pm (3.182 \times 0.325)$$

or 0.979 to 3.047

Therefore, the fact that none of the confidence intervals includes zero means that
$b_2$ and $b_3$ based on the 19 and 6 observations are statistically significant.

c.   F-test 

The t-test is usually used to test the significance of a single coefficient.
The F-test, in addition to performing the function of the t-test, can be used to test the
significance of the complete regression and the significance of a subset of coefficients.
Thus, the F-test of the multiple linear regression is a very useful and powerful tool
for testing the independent variables, X.
In order to test the elements of $\beta$, the linear hypothesis is established as

$$R\beta = r$$

where R is a q x k matrix with rank q, and r is a q-element vector. Therefore, if the
linear hypothesis is true, the following is obtained

$$(Rb - r) \sim N(0, \sigma^2 R(X'X)^{-1}R')$$

Thus, the F statistic under the linear hypothesis is

$$F = \frac{(Rb - r)'[R(X'X)^{-1}R']^{-1}(Rb - r)/q}{e'e/(n-k)} \sim F(q, n-k)$$

In order to test the joint significance of Weight ($X_2$) and Year ($X_3$), the null
hypothesis is established as

$$H_0: \beta_2 = \beta_3 = 0$$

Then, the F statistic for this hypothesis can be indicated by

$$F = \frac{\text{Explained sum of squares}/(k-1)}{\text{Residual sum of squares}/(n-k)} = \frac{(Y'AY - e'e)/(k-1)}{e'e/(n-k)}$$

Thus, the F statistic based on the 19 observations is

$$F = \frac{810.264/(3-1)}{269.490/(19-3)} = 24.053$$

Since n = 19, from the F distribution with 2 and 16 degrees of freedom,

$$F_{0.95}(2, 16) = 3.634 < 24.053$$

Therefore, $H_0: \beta_2 = \beta_3 = 0$ is rejected. It means that even though the sample $R^2$ is
numerically low, the model is significant.
Also, the F statistic based on the 6 observations is

$$F = \frac{300.204/(3-1)}{17.773/(6-3)} = 25.337$$

Since n = 6, from the F distribution with 2 and 3 degrees of freedom,

$$F_{0.95}(2, 3) = 9.552 < 25.337$$

Thus, $H_0: \beta_2 = \beta_3 = 0$ is also rejected. This model is therefore significant with a
numerically high $R^2$.
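
The explained and residual sums of squares determine the F statistic, $R^2$, and the standard error of the regression together, which gives a useful consistency check. The sketch below uses a hypothetical function name. For the 19-observation model the residual sum of squares is taken as 269.490, the value consistent with the reported SE of 4.104 and F of 24.053.

```python
import math


def regression_summary(explained_ss: float, residual_ss: float, n: int, k: int):
    """Return (F, R^2, SE) for a regression with k coefficients including the intercept."""
    f_stat = (explained_ss / (k - 1)) / (residual_ss / (n - k))
    r2 = explained_ss / (explained_ss + residual_ss)
    std_error = math.sqrt(residual_ss / (n - k))
    return f_stat, r2, std_error


f_19, r2_19, se_19 = regression_summary(810.264, 269.490, n=19, k=3)
f_6, r2_6, se_6 = regression_summary(300.204, 17.773, n=6, k=3)

print(round(f_19, 3), round(r2_19, 4), round(se_19, 3))
print(round(f_6, 3), round(r2_6, 4), round(se_6, 3))
```

The two F statistics are close even though the $R^2$ values differ considerably, because the smaller sample leaves far fewer residual degrees of freedom.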
V. ANALYSIS OF THE MODELS

Reliable CERs will accurately predict the costs of systems, provided they are
suitable for that particular system. Thus, in order to establish reliable CERs, the
previous chapter demonstrated the use of regression methods on the simple and multiple
linear regression models performed on various combinations of the explanatory
variables contained in Appendix A. Then, some models were selected as desirable for
predicting costs of fighter aircraft using least-squares estimation. However, many
alternative models were discarded because of statistical problems. Appendix C
illustrates the use of simple and multiple linear regression models for various
combinations of the explanatory variables.

For  selecting  reliable  models,  approximately  1000  models  were  estimated. 
Models  were  evaluated  using  from  one  to  eight  explanatory  variables.  The  summary  of 
these  models  is  presented  below: 


Variables    19 observations                 6 observations
1-3          See Appendix C
4-5          Statistically unsatisfactory
6-8          Statistically unsatisfactory    No report
Then, in order to check how the models fit the data, the selected models were
evaluated with several statistical measures: the coefficient of determination ($R^2$), the
adjusted coefficient of determination ($\bar{R}^2$), standard error (SE), t statistics (t),
confidence intervals (CI), and F statistics (F). However, since no single statistic can be
a meaningful indication of the models' applicability, the models' statistics must be
looked at together. Table 6 shows a summary of the cost estimating models developed.
The table includes the selected equations, the results of the statistical measures, and the
correlation matrices of the estimated coefficients in order to aid in analyzing the
models.
TABLE 6
SUMMARY OF COST ESTIMATING MODELS

A.    Simple linear regression model based on 19 observations

$$\ln(C) = \ln(0.172) + 1.230 \cdot \ln(T)$$

$R^2$ = 0.7007          $\bar{R}^2$ = 0.6831          SE = 0.531
t(b1) = -3.000          t(b2) = 6.309
CI(b2) = 0.819 to 1.642
F = 39.807

CORRELATION MATRIX OF ESTIMATES

              INTERCEP       LPWR
INTERCEP        1.0000    -0.9782
LPWR           -0.9782     1.0000

B.    Multiple linear regression model based on 19 observations

$$C = -701.635 + 0.215W + 0.358Y$$

$R^2$ = 0.7504          $\bar{R}^2$ = 0.7192          SE = 4.104
t(b1) = -2.879          t(b2) = 3.535          t(b3) = 2.864
CI(b2) = 0.086 to 0.343          CI(b3) = 0.093 to 0.623
F = 24.053

CORRELATION MATRIX OF ESTIMATES

              INTERCEP         WT       YEAR
INTERCEP        1.0000     0.5662    -1.0000
WT              0.5662     1.0000    -0.5731
YEAR           -1.0000    -0.5731     1.0000

C.    Multiple linear regression model based on 6 observations

$$C = -3994.618 + 0.688W + 2.013Y$$

$R^2$ = 0.9441          $\bar{R}^2$ = 0.9068          SE = 2.434
t(b1) = -6.185          t(b2) = 7.088          t(b3) = 6.194
CI(b2) = 0.379 to 0.997          CI(b3) = 0.979 to 3.047
F = 25.337

CORRELATION MATRIX OF ESTIMATES

              INTERCEP         WT       YEAR
INTERCEP        1.0000    -0.8239    -1.0000
WT             -0.8239     1.0000     0.8209
YEAR           -1.0000     0.8209     1.0000
Additionally, characteristics other than statistical measures should be considered
in analyzing the models. Some of them are:

1)  The signs and the magnitudes. Usually, cost is expected to increase with
thrust and weight. Additionally, since the new aircraft contain modular
avionics which are easily updated (e.g., radar, electronic equipment, etc.), the
cost of the new aircraft is expected to increase with year of initial operational
capability. Therefore, the developed models containing positive
coefficients for thrust, weight, and year make sense.

2)  The  constant  term.  The  developed  multiple  linear  regression  models  contained 
large  negative  constant  terms.  This  means  that  the  developed  multiple  linear 
regression  models  would  not  be  valid  over  the  full  range  of  possible  values  of 
the  independent  variables. 

3)  The  correlation  matrix.  The  correlation  matrices  are  included  in  the  table  to 
aid  in  determining  the  multicollinearities  that  may  exist  between  the  various 
independent  variables  in  the  models. 

Table  6  shows  that  all  of  the  t  statistics  are  greater  than  their  critical  values,  and 
the  confidence  intervals  do  not  include  zero.  This  means  that  all  of  the  unbiased 
estimators  of  the  developed  models  are  significantly  different  from  zero.  Furthermore, 
since  all  of  the  F  statistics  are  greater  than  their  critical  values,  the  developed  models 
are  significant. 

However, the multiple linear regression model based on the 6 observations which have
an initial operational capability year following 1965 has desirable values of the
coefficient of determination ($R^2$), 0.9441, and the coefficient of determination adjusted
for degrees of freedom ($\bar{R}^2$), 0.9068. This indicates that the equation based on 6
observations fits the data well because the independent variables, weight and year,
explain most of the variance in the cost data.

Also, Table 6 shows that the multiple linear regression model based on 19
observations has a large standard error (SE), which is a measure of the
dispersion of the data and relates to the prediction intervals. It indicates that the
multiple linear regression model based on 19 observations does not have desirable
prediction intervals. Therefore, the multiple linear regression model based on 6
observations is selected as a desirable fighter aircraft CER.
Initially, the data base contained a large number of international fighter aircraft,
but many observations were eliminated because of insufficient information. As such,
only 19 observations were chosen. Since models using 19 observations were
statistically unsatisfactory, a small subset of 6 observations was selected from the
original 19. Then, the Chow test [Ref. 12: pp. 207-225] was performed on the models with
6 observations and with the other 13 observations (i.e., comparisons were made
to determine if both data sets came from the same population of fighter aircraft).
Appendix D shows the test results, which indicate that the two groups of data are not
from the same population. The 6 observations are representative of current fighter
aircraft, and should provide the best estimates of future fighter aircraft costs.

The purpose of CERs is to estimate the cost of systems: substituting the
parameters of a proposed system into the CER yields an estimate of the cost of that
system. There are two kinds of prediction: a point prediction and an
interval prediction. If the obtained equation fits the data well, then a good prediction
will be possible. However, it is very unlikely that the point prediction will be realized
exactly. Therefore, a prediction interval should be constructed in order to describe the
uncertainty of the estimates.

Point prediction is obtained by substituting the values of the explanatory variables
into the selected equation. As implied earlier, the selected multiple linear regression
model based on 6 observations is

C6  =  -3994.618 + 0.688W + 2.013Y

Thus, since the values of weight (W) and year (Y) of the F-16A are 35.4 and 1978, the
selected regression equation gives the point estimate for the F-16A as follows:

C6  =  -3994.618 + 0.688(35.4) + 2.013(1978)
    =  11.917
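
The substitution can be sketched in a few lines of Python. The function name and argument names below are illustrative. The coefficients are the three-decimal values printed in the equation, so the result (about 11.451) is slightly lower than the 11.917 quoted in the text. Because Y is close to 2000, rounding the year coefficient to three decimals can by itself move the result by several tenths, so the difference is consistent with the thesis having used unrounded coefficients; that explanation is an assumption, not something the text states.

```python
def cer_m6(weight_klb: float, ioc_year: int) -> float:
    """Point estimate from the printed M6 equation, in millions of 1981 dollars.

    weight_klb: maximum take-off gross weight divided by 1000 (lb)
    ioc_year:   year of initial operational capability
    """
    return -3994.618 + 0.688 * weight_klb + 2.013 * ioc_year


# F-16A inputs taken from the text: W = 35.4, Y = 1978.
f16a = cer_m6(35.4, 1978)
print(f"{f16a:.3f}")  # 11.451 with the rounded coefficients as printed
```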

Also, the following formula is used to construct a 100(1 - p) percent prediction
interval (PI) for the point estimate. It includes the standard error (SE) and is given as
follows:

$$ PI = Y_f^* \pm t_{p/2} \cdot SE \cdot \sqrt{1 + R(X'X)^{-1}R'} $$

where $Y_f^*$ is the point forecast, X is the matrix of the data base with a first column
of ones, and R is the row vector of the proposed system's parameters.
Thus, for the F-16A,

$Y_f^* = 11.917$
$t_{0.025}(3) = 3.182$
$SE = 2.434$

$$ (X'X)^{-1} = \begin{bmatrix} 70402.8289 & -8.7208 & -35.4297 \\ -8.7208 & 0.0016 & 0.0044 \\ -35.4297 & 0.0044 & 0.0178 \end{bmatrix} $$

$R(X'X)^{-1}R' = 0.519$

Therefore, a 95 percent prediction interval for the F-16A based on 6 observations is

$$ PI = 11.917 \pm 2.120 \cdot (2.434) \cdot \sqrt{1.519} = 11.917 \pm 6.360 $$

or

5.557  to  18.277
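
The arithmetic of the interval can be retraced with a short Python sketch. It follows the printed calculation, which multiplies by 2.120 (the value 3.182 is also listed for t), and it takes the quadratic form R(X'X)^-1 R' = 0.519 as given instead of recomputing it from the four-decimal matrix, whose rounding would dominate the result. The function and parameter names are illustrative.

```python
import math


def prediction_interval(point, t_mult, se, leverage):
    """Return (half_width, lower, upper) for point +/- t_mult * se * sqrt(1 + leverage)."""
    half = t_mult * se * math.sqrt(1.0 + leverage)
    return half, point - half, point + half


# Values as printed in the worked example for the F-16A.
half, lower, upper = prediction_interval(point=11.917, t_mult=2.120, se=2.434, leverage=0.519)
print(f"{half:.3f} {lower:.3f} {upper:.3f}")  # 6.360 5.557 18.277
```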


Up to this point we have seen some reasons to believe that the multiple linear
regression model based on 6 observations will give a better estimate of fighter aircraft
cost than more broadly based models. To aid in comparing the selected models,
Table 7 shows a summary of the cost predictions. It includes the cost
predictions for the F-16A and F-18A.

As a result, the table verifies that the multiple linear regression model based on 6
observations gives a better estimate than the other models. This means that,
since the 6 observations are new fighter aircraft with an initial operational capability
year following 1965, a model based on new aircraft data may correctly predict the cost
of a new fighter aircraft.

The cost used in this thesis is the cumulative average cost of 100 units for total
flyaway cost in millions of 1981 dollars.
TABLE 7
SUMMARY OF COST PREDICTIONS

                                F-16A              F-18A

Actual cost                     9.641              23.968

Point            S19            9.025              12.231
Prediction       M19            14.527             17.853
                 M6             11.917             23.444

Prediction       S19            2.841-28.691       3.800-39.370
Intervals        M19            4.145-24.909       7.817-27.889
                 M6             5.557-18.277       13.860-33.028

where:

S19  =  simple linear regression model based on 19 observations
M19  =  multiple linear regression model based on 19 observations
M6   =  multiple linear regression model based on 6 observations
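
To make the comparison in Table 7 easier to read, the absolute error of each point prediction and the width of each prediction interval can be tabulated. The sketch below only restates the Table 7 values; the dictionary layout and variable names are illustrative.

```python
# Values copied from Table 7 (millions of 1981 dollars).
actual = {"F-16A": 9.641, "F-18A": 23.968}
point = {
    "S19": {"F-16A": 9.025, "F-18A": 12.231},
    "M19": {"F-16A": 14.527, "F-18A": 17.853},
    "M6": {"F-16A": 11.917, "F-18A": 23.444},
}
interval = {
    "S19": {"F-16A": (2.841, 28.691), "F-18A": (3.800, 39.370)},
    "M19": {"F-16A": (4.145, 24.909), "F-18A": (7.817, 27.889)},
    "M6": {"F-16A": (5.557, 18.277), "F-18A": (13.860, 33.028)},
}

errors = {}
widths = {}
for model in point:
    errors[model] = {ac: abs(point[model][ac] - actual[ac]) for ac in actual}
    widths[model] = {ac: interval[model][ac][1] - interval[model][ac][0] for ac in actual}
    for ac in actual:
        print(f"{model:4s} {ac}: |error| = {errors[model][ac]:.3f}, width = {widths[model][ac]:.3f}")
```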
VI.  CONCLUSION

This  thesis  presented  a  regression  model  of  a  CER  for  fighter  aircraft.  It  is  based 
on  19  fighter  aircraft  because  the  major  objective  of  this  thesis  is  developing  CERs  for 
fighter  aircraft  only. 

As  implied  earlier,  there  are  many  CERs  for  aircraft.  They  are  very  useful  for 
developing  new  CERs  but  are  different  from  each  other.  The  differences  mostly 
depend  upon  the  aircraft  types,  the  included  aircraft  data,  and  the  statistical  methods 
used.  However,  even  though  they  are  different  from  each  other,  their  results  are 
similar.  This  means  that  since  the  purpose  of  CERs  is  to  provide  a  reasonable  cost 
estimation  of  systems,  they  give  similar  estimates  of  a  particular  aircraft. 

As a result of this thesis, a multiple linear regression model based on 6
observations is selected as the best model to estimate the costs of fighter aircraft.
This is a very meaningful result because the 6 observations are new fighter aircraft
with an initial operational capability year following 1965. There may be several reasons
for this result, such as the limited data base of the model or the applied statistical
methods. But the most reasonable cause of the result is the characteristics of the data.
Traditionally, every new fighter aircraft requires large development costs. Also, it
includes developed systems such as radar, electronic equipment, armament systems,
etc. Undoubtedly, those systems are very expensive. However, those characteristics
usually were not considered as explanatory variables. This means that a model
based on old technology may incorrectly estimate the cost of a new system containing
advanced technology. Therefore, in order to estimate the costs of modern or future
fighter aircraft, CERs should be based on new aircraft data.

There were some difficulties in developing CERs for fighter aircraft. The data
problem was the first and most difficult problem. A sufficient number of observations
can support the distribution assumptions and reduce the standard error. Thus, CERs
based on a sufficient number of observations may give better confidence or prediction
intervals because these are functions of the standard error. However, since the fighter
aircraft data used in this thesis were very limited, the result was a rather large standard
error and wide confidence and prediction intervals.
Similarly, accuracy of the data is very important. Inaccurate data are worthless
because they cannot lead to reliable CERs. Thus, under such conditions, it is very hard
to expect accurate estimates. However, some explanatory variables of the new fighter
aircraft data, such as the maximum speed of the F-18A, were classified. As a result, the
selected models were not very good CERs.

Additionally, as a statistical method, OLS has some problems. OLS is almost
exclusively the selected regression technique. It is based on the assumption that the
error term is normally distributed, and the estimates are selected to minimize the sum
of the squared deviations of actual cost observations from their estimates. However,
OLS as a regression method is quite sensitive to outlying observations. If the data
base includes some unusual observations, then it tends to give a poor result. Thus,
there is a tendency to discard those observations that seem to lie outside a normal
trend line in order to remove a possible bias in the estimating equation.
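
The sensitivity of OLS to a single outlying observation can be illustrated with a small sketch. The data below are synthetic and purely hypothetical (they are not aircraft data). The function implements the closed-form simple-regression estimates that minimize the sum of squared deviations.

```python
def ols_line(xs, ys):
    """Return (intercept, slope) of the least-squares line through (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Hypothetical points lying exactly on y = 2x.
clean_x = [1, 2, 3, 4, 5]
clean_y = [2, 4, 6, 8, 10]
b0_clean, b1_clean = ols_line(clean_x, clean_y)

# The same points plus one unusual observation at (6, 30).
b0_out, b1_out = ols_line(clean_x + [6], clean_y + [30])

print(f"without outlier: intercept = {b0_clean:.3f}, slope = {b1_clean:.3f}")
print(f"with outlier:    intercept = {b0_out:.3f}, slope = {b1_out:.3f}")
```

In this toy case one added point more than doubles the fitted slope, which is the behavior the paragraph above describes.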

Finally, further study and development of CERs for fighter aircraft should
consider the following:

1)  Use accurate and sufficient data. The purpose of this study is to obtain reliable
CERs which give accurate cost estimates of the systems. This is possible
by using accurate data. Furthermore, sufficient data can reduce the standard
error and thereby narrow the confidence and prediction intervals, because they
depend upon the standard error.

2)  Use alternate methods. OLS is the most frequently used estimating technique
for CERs, but it is not a perfect technique by itself. Thus, the established CERs
need to be supported and compared, and alternate methods, such as
generalized least squares or least absolute value regression, can serve that purpose.

Additionally, in order to estimate the costs of modern or future systems, it is
also important that new data be added to the model and old data
removed. That enables the model to be kept updated and restricted to fighter aircraft
with similar characteristics.
APPENDIX  A
AIRCRAFT  DATA

A/C       Cost      Span    Thrust    Weight    Speed    SER    Year

F4E        5.919    38.6    35.80     61.795    1394     1      1966
F6A        4.419    33.5    14.50     25.000     612     2      1952
F8E        3.297    35.2    18.00     34.000     986     2      1961
F9F        0.930    38.0     5.75     16.450     463     2      1951
F11A       4.895    31.6    10.50     24.078     783     2      1953
F14A      23.901    64.1    41.80     74.349    1342     2      1971
F15A      19.356    42.8    50.00     56.000    1440     1      1973
F16A       9.641    31.0    25.00     35.400    1150     1      1978
F18A      23.968    37.5    32.00     49.224     980     2      1979
F84F       5.943    33.6     7.22     28.000     579     1      1951
F86F       1.095    39.1     5.91     20.611     537     1      1951
F89D       3.496    59.7    14.40     41.000     537     1      1951
F100D      2.659    38.8    16.95     34.832     760     1      1954
F101B      7.419    39.7    29.98     46.673    1074     1      1956
F102A      9.206    38.1    17.20     31.500     726     1      1953
F104C      4.612    21.9    15.80     23.590    1276     1      1956
F105D     10.637    34.9    26.50     52.546    1223     1      1958
F106A     11.255    38.3    24.50     38.250    1342     1      1957
F111A     23.510    63.0    37.00     91.500    1452     1      1965

note :

A/C     =  type of fighter aircraft
Cost    =  cumulative average cost (CAC) of 100 units for total
           flyaway cost in constant 1981 dollars (millions)
Span    =  wing span (ft)
Thrust  =  (total maximum engine thrust)/1000 (lb)
Weight  =  (maximum take-off gross weight)/1000 (lb)
Speed   =  maximum speed at best altitude (kts)
SER     =  identification of service (Navy = 2, Air Force = 1)
Year    =  year of initial operational capability

Source : References 6-11
APPENDIX  B
PRICE  INDEX

            DOD OUTLAY
FISCAL      ESCALATION        COMPOSITE INDEXES
YEAR        INDEX             APN        APA        APAF

1950         26.44            28.68      29.00      28.80
1951         29.03            29.13      29.11      29.19
1952         29.15            29.12      29.24      29.09
1953         28.91            29.46      30.09      29.80
1954         28.45            30.53      31.29      30.80
1955         30.34            31.79      32.64      32.17
1956         30.93            33.11      33.34      33.21
1957         33.62            33.36      33.10      33.28
1958         33.60            32.92      32.76      32.84
1959         33.06            32.69      32.77      32.68
1960         32.39            32.84      32.83      32.89
1961         33.00            32.79      32.63      32.68
1962         33.01            32.62      32.82      32.74
1963         31.93            33.06      33.68      33.30
1964         32.54            34.21      35.40      34.72
1965         32.88            36.25      37.57      36.81
1966         35.47            38.40      39.46      38.87
1967         37.75            40.13      40.98      40.51
1968         39.67            41.57      42.57      42.00
1969         40.73            43.30      44.51      43.82
1970         42.32            45.39      46.70      45.97
1971         44.28            47.70      49.33      48.41
1972         46.14            50.61      53.09      51.68
1973         48.38            57.41      58.02      56.17
1974         52.47            60.69      62.86      61.32
1975         58.44            64.83      66.98      66.06
1976         63.79            69.53      71.10      70.84
1977         68.67            77.60      78.73      79.00
1978         73.57            85.45      87.05      86.77
1979         80.16            96.49      96.53      97.33
1980         89.62           106.54     107.62     106.30
1981        100.00           117.62     119.60     118.20
1982        114.30           126.36     128.40     126.32
1983        121.73           134.62     136.34     134.64
1984        130.12           143.06     144.80     143.08
1985        138.41           151.68     153.45     151.69
1986        146.87           160.60     162.45     160.59
1987        155.46           169.99     171.95     169.99
1988        164.55           179.94     182.01     179.93
1989        174.18           190.47     192.66     190.46

Source : Reference 6
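
A price index of this kind is typically applied by scaling a then-year cost by the ratio of the base-year index to the index of the year in which the cost was incurred. The sketch below assumes the first index column (1981 = 100.00) and a plain ratio adjustment. Which column and which exact procedure the thesis applied to each aircraft is not stated in this appendix, so the function name and the choice of column are assumptions.

```python
# A few rows of the first index column of Appendix B (1981 = 100.00).
DOD_OUTLAY_INDEX = {1951: 29.03, 1966: 35.47, 1971: 44.28, 1978: 73.57, 1981: 100.00}


def to_constant_1981(cost_then_year, fiscal_year, index=DOD_OUTLAY_INDEX):
    """Scale a then-year cost to constant 1981 dollars by the ratio of index values."""
    return cost_then_year * index[1981] / index[fiscal_year]


# Example: 1.0 million dollars spent in fiscal year 1971.
print(f"{to_constant_1981(1.0, 1971):.3f}")  # 2.258
```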
APPENDIX  C
SIMPLE  AND  MULTIPLE  LINEAR  REGRESSION  MODELS

1)    Models with 3 variables

VAR                    DATA      FUNCTION    R2     PROB>F  PROB>|t| PROB>|t| PROB>|t| PROB>|t|

span, thrust, weight   19 obs.   linear      0.69   0.01    0.29     0.73     0.09     0.46
                       19 obs.   loglinear   0.73   0.01    0.38     0.45     0.24     0.25
                       6 obs.    linear      0.50   0.64    0.95     0.43     0.85     0.64
                       6 obs.    loglinear   0.54   0.60    0.66     0.34     0.80     0.47

span, thrust, speed    19 obs.   linear      0.68   0.01    0.33     0.27     0.04     0.99
                       19 obs.   loglinear   0.70   0.01    0.52     0.77     0.06     0.76
                       6 obs.    linear      0.87   0.19    0.21     0.10     0.22     0.12
                       6 obs.    loglinear   0.77   0.32    0.25     0.20     0.41     0.20

span, thrust, ser      19 obs.   linear      0.71   0.01    0.10     0.23     0.01     0.23
                       19 obs.   loglinear   0.70   0.01    0.30     0.89     0.01     0.76
                       6 obs.    linear      0.65   0.47    0.88     0.47     0.73     0.37
                       6 obs.    loglinear   0.55   0.59    0.73     0.56     0.84     0.46

span, thrust, year     19 obs.   linear      0.75   0.01    0.06     0.07     0.15     0.06
                       19 obs.   loglinear   0.72   0.01    0.40     0.78     0.01     0.41
                       6 obs.    linear      0.80   0.28    0.19     0.13     0.64     0.19
                       6 obs.    loglinear   0.85   0.21    0.12     0.11     0.68     0.12

span, thrust, t/w      19 obs.   linear      0.68   0.01    0.81     0.63     0.02     0.68
                       19 obs.   loglinear   0.73   0.01    0.38     0.45     0.01     0.25
                       6 obs.    linear      0.53   0.61    0.60     0.39     0.71     0.58
                       6 obs.    loglinear   0.54   0.59    0.65     0.34     0.59     0.47

span, weight, speed    19 obs.   linear      0.63   0.01    0.55     0.99     0.15     0.67
                       19 obs.   loglinear   0.70   0.01    0.53     0.41     0.05     0.67
                       6 obs.    linear      0.67   0.44    0.42     0.40     0.93     0.40
                       6 obs.    loglinear   0.69   0.42    0.47     0.30     0.67     0.41

span, weight, ser      19 obs.   linear      0.68   0.01    0.51     0.50     0.01     0.15
                       19 obs.   loglinear   0.72   0.01    0.47     0.06     0.01     0.32
                       6 obs.    linear      0.62   0.50    0.91     0.69     0.95     0.49
                       6 obs.    loglinear   0.59   0.54    0.77     0.47     0.64     0.61

span, weight, year     19 obs.   linear      0.76   0.01    0.01     0.50     0.10     0.01
                       19 obs.   loglinear   0.73   0.01    0.19     0.23     0.01     0.19
                       6 obs.    linear      0.95   0.08    0.05     0.79     0.12     0.05
                       6 obs.    loglinear   0.92   0.11    0.08     0.73     0.27     0.08

span, weight, t/w      19 obs.   linear      0.69   0.01    0.11     0.60     0.02     0.11
                       19 obs.   loglinear   0.73   0.01    0.38     0.45     0.01     0.24
                       6 obs.    linear      0.63   0.63    0.84     0.43     0.76     0.80
                       6 obs.    loglinear   0.54   0.59    0.65     0.34     0.59     0.79

span, speed, ser       19 obs.   linear      0.63   0.01    0.01     0.02     0.01     0.16
                       19 obs.   loglinear   0.63   0.01    0.00     0.12     0.01     0.60
                       6 obs.    linear      0.67   0.44    0.62     0.37     0.63     0.90
                       6 obs.    loglinear   0.66   0.46    0.51     0.31     0.48     0.85

span, speed, year      19 obs.   linear      0.74   0.01    0.01     0.01     0.22     0.01
                       19 obs.   loglinear   0.67   0.01    0.13     0.16     0.02     0.14
                       6 obs.    linear      0.78   0.31    0.44     0.12     0.85     0.43
                       6 obs.    loglinear   0.83   0.23    0.27     0.08     0.93     0.27

span, speed, t/w       19 obs.   linear      0.61   0.01    0.01     0.01     0.14     0.25
                       19 obs.   loglinear   0.64   0.01    0.01     0.08     0.03     0.39
                       6 obs.    linear      0.83   0.24    0.80     0.09     0.17     0.29
                       6 obs.    loglinear   0.81   0.27    0.27     0.10     0.19     0.32

span, ser, year        19 obs.   linear      0.71   0.01    0.01     0.01     0.90     0.01
                       19 obs.   loglinear   0.55   0.01    0.01     0.28     0.42     0.01
                       6 obs.    linear      0.77   0.32    0.37     0.20     0.90     0.36
                       6 obs.    loglinear   0.85   0.21    0.17     0.11     0.69     0.17

span, ser, t/w         19 obs.   linear      0.57   0.01    0.01     0.01     0.51     0.01
                       19 obs.   loglinear   0.49   0.01    0.29     0.03     0.66     0.01
                       6 obs.    linear      0.68   0.43    0.67     0.30     0.38     0.61
                       6 obs.    loglinear   0.59   0.54    0.62     0.36     0.49     0.64

span, year, t/w        19 obs.   linear      0.72   0.01    0.01     0.01     0.01     0.49
                       19 obs.   loglinear   0.59   0.01    0.07     0.11     0.08     0.15
                       6 obs.    linear      0.77   0.32    0.24     0.13     0.25     0.90
                       6 obs.    loglinear   0.83   0.23    0.16     0.10     0.16     0.96

thrust, weight, speed  19 obs.   linear      0.70   0.01    0.58     0.09     0.16     0.56
                       19 obs.   loglinear   0.72   0.01    0.33     0.24     0.36     0.74
                       6 obs.    linear      0.82   0.25    0.19     0.19     0.14     0.13
                       6 obs.    loglinear   0.58   0.56    0.35     0.44     0.44     0.31

thrust, weight, ser    19 obs.   linear      0.73   0.01    0.06     0.07     0.11     0.15
                       19 obs.   loglinear   0.72   0.01    0.04     0.06     0.34     0.63
                       6 obs.    linear      0.63   0.50    0.89     0.68     0.52     0.30
                       6 obs.    loglinear   0.47   0.67    0.82     0.74     0.77     0.41

thrust, weight, year   19 obs.   linear      0.75   0.01    0.07     0.79     0.06     0.07
                       19 obs.   loglinear   0.73   0.01    0.34     0.23     0.31     0.34
                       6 obs.    linear      0.99   0.02    0.01     0.11     0.01     0.01
                       6 obs.    loglinear   0.92   0.12    0.05     0.80     0.06     0.05

thrust, weight, t/w    19 obs.   linear      0.69   0.01    0.68     0.55     0.48     0.97
                       19 obs.   loglinear   0.72   0.01    0.04     0.05     0.37
                       6 obs.    linear      0.29   0.84    0.86     0.91     0.78     0.85
                       6 obs.    loglinear   0.20   0.71    0.80     0.73     0.72     .

thrust, speed, ser     19 obs.   linear      0.68   0.01    0.55     0.01     0.85     0.26
                       19 obs.   loglinear   0.70   0.01    0.48     0.02     0.80     0.72
                       6 obs.    linear      0.51   0.63    0.98     0.67     0.99     0.48
                       6 obs.    loglinear   0.46   0.69    0.88     0.61     0.85     0.65

thrust, speed, year    19 obs.   linear      0.69   0.01    0.19     0.05     0.59     0.19
                       19 obs.   loglinear   0.71   0.01    0.43     0.05     0.89     0.43
                       6 obs.    linear      0.40   0.75    0.69     0.37     0.51     0.69
                       6 obs.    loglinear   0.38   0.38    0.96     0.39     0.64     0.97

thrust, speed, t/w     19 obs.   linear      0.68   0.01    0.55     0.01     0.79     0.27
                       19 obs.   loglinear   0.72   0.01    0.33     0.01     0.74     0.36
                       6 obs.    linear      0.71   0.40    0.20     0.16     0.22     0.25
                       6 obs.    loglinear   0.58   0.56    0.35     0.25     0.31     0.44

thrust, ser, year      19 obs.   linear      0.70   0.01    0.31     0.02     0.37     0.31
                       19 obs.   loglinear   0.71   0.01    0.46     0.01     0.95     0.46
                       6 obs.    linear      0.51   0.63    0.98     0.54     0.37     0.98
                       6 obs.    loglinear   0.46   0.68    0.81     0.52     0.50     0.81

thrust, ser, t/w       19 obs.   linear      0.72   0.01    0.79     0.01     0.19     0.20
                       19 obs.   loglinear   0.72   0.01    0.04     0.01     0.63     0.34
                       6 obs.    linear      0.57   0.56    0.76     0.44     0.34     0.65
                       6 obs.    loglinear   0.47   0.67    0.82     0.52     0.41     0.77

thrust, year, t/w      19 obs.   linear      0.74   0.01    0.08     0.01     0.08     0.10
                       19 obs.   loglinear   0.73   0.01    0.34     0.01     0.34     0.31
                       6 obs.    linear      0.91   0.13    0.06     0.05     0.06     0.06
                       6 obs.    loglinear   0.92   0.12    0.05     0.05     0.05     0.06

weight, speed, ser     19 obs.   linear      0.68   0.01    0.06     0.01     0.35     0.13
                       19 obs.   loglinear   0.70   0.01    0.01     0.02     0.11     0.42
                       6 obs.    linear      0.60   0.53    0.78     0.49     0.83     0.54
                       6 obs.    loglinear   0.45   0.69    0.85     0.62     0.85     0.68

weight, speed, year    19 obs.   linear      0.75   0.01    0.01     0.01     0.65     0.02
                       19 obs.   loglinear   0.72   0.01    0.19     0.03     0.33     0.19
                       6 obs.    linear      0.96   0.06    0.04     0.02     0.49     0.04
                       6 obs.    loglinear   0.95   0.08    0.04     0.03     0.37     0.04

weight, speed, t/w     19 obs.   linear      0.70   0.01    0.04     0.01     0.41     0.09
                       19 obs.   loglinear   0.72   0.01    0.33     0.01     0.74     0.24
                       6 obs.    linear      0.92   0.12    0.97     0.04     0.06     0.08
                       6 obs.    loglinear   0.58   0.56    0.35     0.25     0.31     0.44

weight, ser, year      19 obs.   linear      0.77   0.01    0.02     0.01     0.33     0.02
                       19 obs.   loglinear   0.71   0.01    0.10     0.01     0.92     0.10
                       6 obs.    linear      0.94   0.08    0.07     0.05     0.94     0.07
                       6 obs.    loglinear   0.99   0.01    0.01     0.01     0.03     0.01

weight, ser, t/w       19 obs.   linear      0.73   0.01    0.02     0.01     0.14     0.09
                       19 obs.   loglinear   0.72   0.01    0.04     0.01     0.63     0.06
                       6 obs.    linear      0.69   0.42    0.57     0.29     0.24     0.50
                       6 obs.    loglinear   0.47   0.67    0.82     0.52     0.41     0.74

weight, year, t/w      19 obs.   linear      0.75   0.01    0.06     0.01     0.06     0.87
                       19 obs.   loglinear   0.73   0.01    0.34     0.01     0.34     0.23
                       6 obs.    linear      0.99   0.02    0.01     0.01     0.01     0.13
                       6 obs.    loglinear   0.92   0.12    0.05     0.05     0.05     0.79

speed, ser, year       19 obs.   linear      0.61   0.01    0.04     0.19     0.54     0.04
                       19 obs.   loglinear   0.62   0.01    0.12     0.05     0.99     0.13
                       6 obs.    linear      0.48   0.66    0.79     0.61     0.30     0.79
                       6 obs.    loglinear   0.41   0.73    0.72     0.61     0.37     0.72

speed, ser, t/w        19 obs.   linear      0.47   0.02    0.23     0.01     0.20     0.84
                       19 obs.   loglinear   0.56   0.01    0.04     0.01     0.63     0.92
                       6 obs.    linear      0.46   0.68    0.92     0.63     0.34     0.96
                       6 obs.    loglinear   0.36   0.78    0.82     0.74     0.42     0.93

speed, year, t/w       19 obs.   linear      0.62   0.01    0.01     0.13     0.01     0.29
                       19 obs.   loglinear   0.63   0.01    0.10     0.05     0.11     0.81
                       6 obs.    linear      0.26   0.87    0.52     0.57     0.52     0.49
                       6 obs.    loglinear   0.33   0.81    0.45     0.52     0.45     0.44

2)    Models with 2 variables

VAR               DATA      FUNCTION    R2     PROB>F  PROB>|t| PROB>|t| PROB>|t|

span, thrust      19 obs.   linear      0.68   0.01    0.14     0.24     0.01
                  19 obs.   loglinear   0.70   0.01    0.29     0.87     0.01
                  6 obs.    linear      0.43   0.43    0.92     0.31     0.84
                  6 obs.    loglinear   0.37   0.50    0.66     0.39     0.92

span, weight      19 obs.   linear      0.63   0.01    0.65     0.66     0.01
                  19 obs.   loglinear   0.70   0.01    0.48     0.09     0.01
                  6 obs.    linear      0.49   0.36    0.91     0.30     0.56
                  6 obs.    loglinear   0.52   0.33    0.65     0.23     0.39

span, speed       19 obs.   linear      0.58   0.01    0.02     0.02     0.01
                  19 obs.   loglinear   0.62   0.01    0.01     0.11     0.01
                  6 obs.    linear      0.67   0.19    0.27     0.09     0.23
                  6 obs.    loglinear   0.65   0.21    0.28     0.10     0.22

span, ser         19 obs.   linear      0.27   0.08    0.37     0.03     0.69
                  19 obs.   loglinear   0.10   0.42    0.49     0.21     0.75
                  6 obs.    linear      0.62   0.23    0.89     0.25     0.29
                  6 obs.    loglinear   0.54   0.32    0.73     0.31     0.37

span, year        19 obs.   linear      0.71   0.01    0.01     0.01     0.01
                  19 obs.   loglinear   0.52   0.01    0.01     0.27     0.01
                  6 obs.    linear      0.77   0.11    0.12     0.05     0.12
                  6 obs.    loglinear   0.83   0.07    0.06     0.03     0.06

span, t/w         19 obs.   linear      0.55   0.01    0.01     0.01     0.01
                  19 obs.   loglinear   0.49   0.01    0.27     0.03     0.01
                  6 obs.    linear      0.48   0.37    0.63     0.20     0.58
                  6 obs.    loglinear   0.45   0.40    0.46     0.22     0.54

thrust, weight    19 obs.   linear      0.69   0.01    0.16     0.08     0.16
                  19 obs.   loglinear   0.72   0.01    0.04     0.05     0.37
                  6 obs.    linear      0.63   0.63    0.98     0.72     0.53
                  6 obs.    loglinear   0.20   0.72    0.81     0.74     0.72

thrust, speed     19 obs.   linear      0.65   0.01    0.92     0.01     0.61
                  19 obs.   loglinear   0.70   0.01    0.51     0.01     0.85
                  6 obs.    linear      0.34   0.54    0.45     0.31     0.43
                  6 obs.    loglinear   0.38   0.49    0.42     0.28     0.37

thrust, ser       19 obs.   linear      0.68   0.01    0.22     0.01     0.21
                  19 obs.   loglinear   0.70   0.01    0.01     0.01     0.75
                  6 obs.    linear      0.51   0.34    0.94     0.41     0.23
                  6 obs.    loglinear   0.44   0.41    0.85     0.45     0.30

thrust, year      19 obs.   linear      0.69   0.01    0.18     0.02     0.18
                  19 obs.   loglinear   0.71   0.01    0.41     0.01     0.41
                  6 obs.    linear      0.21   0.70    0.66     0.45     0.66
                  6 obs.    loglinear   0.29   0.59    0.50     0.36     0.50

thrust, t/w       19 obs.   linear      0.68   0.01    0.58     0.01     0.22
                  19 obs.   loglinear   0.72   0.01    0.04     0.01     0.37
                  6 obs.    linear      0.25   0.65    0.63     0.42     0.57
                  6 obs.    loglinear   0.20   0.72    0.81     0.48     0.72

weight, speed     19 obs.   linear      0.63   0.01    0.17     0.01     0.53
                  19 obs.   loglinear   0.69   0.01    0.01     0.02     0.12
                  6 obs.    linear      0.49   0.36    0.27     0.19     0.30
                  6 obs.    loglinear   0.39   0.48    0.38     0.27     0.37

weight, ser       19 obs.   linear      0.66   0.01    0.08     0.01     0.17
                  19 obs.   loglinear   0.65   0.01    0.01     0.01     0.54
                  6 obs.    linear      0.58   0.27    0.78     0.30     0.21
                  6 obs.    loglinear   0.44   0.42    0.98     0.46     0.31

weight, year      19 obs.   linear      0.75   0.01    0.01     0.01     0.01
                  19 obs.   loglinear   0.71   0.01    0.07     0.01     0.07
                  6 obs.    linear      0.94   0.01    0.01     0.01     0.01
                  6 obs.    loglinear   0.92   0.02    0.01     0.01     0.01

weight, t/w       19 obs.   linear      0.68   0.01    0.04     0.01     0.10
                  19 obs.   loglinear   0.72   0.01    0.04     0.01     0.05
                  6 obs.    linear      0.28   0.61    0.81     0.39     0.68
                  6 obs.    loglinear   0.20   0.72    0.81     0.48     0.74

speed, ser        19 obs.   linear      0.47   0.01    0.13     0.01     0.19
                  19 obs.   loglinear   0.56   0.01    0.01     0.01     0.59
                  6 obs.    linear      0.46   0.40    0.84     0.52     0.21
                  6 obs.    loglinear   0.36   0.51    0.75     0.64     0.29

speed, year       19 obs.   linear      0.59   0.01    0.01     0.23     0.01
                  19 obs.   loglinear   0.62   0.01    0.09     0.03     0.09
                  6 obs.    linear      0.08   0.99    0.93     0.99     0.93
                  6 obs.    loglinear   0.04   0.95    0.82     0.92     0.82

speed, t/w        19 obs.   linear      0.41   0.02    0.40     0.02     0.99
                  19 obs.   loglinear   0.55   0.01    0.04     0.01     0.82
                  6 obs.    linear      0.04   0.94    0.50     0.87     0.75
                  6 obs.    loglinear   0.04   0.94    0.70     0.82     0.78

ser, year         19 obs.   linear      0.56   0.01    0.01     0.95     0.01
                  19 obs.   loglinear   0.51   0.01    0.01     0.41     0.01
                  6 obs.    linear      0.39   0.47    0.73     0.26     0.74
                  6 obs.    loglinear   0.31   0.57    0.91     0.35     0.91

ser, t/w          19 obs.   linear      0.19   0.19    0.72     0.63     0.08
                  19 obs.   loglinear   0.31   0.05    0.01     0.71     0.02
                  6 obs.    linear      0.38   0.49    0.32     0.29     0.83
                  6 obs.    loglinear   0.32   0.56    0.03     0.33     0.82

year, t/w         19 obs.   linear      0.56   0.01    0.01     0.01     0.65
                  19 obs.   loglinear   0.51   0.01    0.02     0.02     0.39
                  6 obs.    linear      0.09   0.86    0.70     0.69     0.63
                  6 obs.    loglinear   0.14   0.80    0.58     0.57     0.58
3)    Models with 1 variable

VAR       DATA      FUNCTION    R2     PROB>F  PROB>|t| PROB>|t|

span      19 obs.   linear      0.27   0.02    0.39     0.02
          19 obs.   loglinear   0.10   0.20    0.47     0.20
          6 obs.    linear      0.42   0.17    0.95     0.17
          6 obs.    loglinear   0.37   0.20    0.58     0.20

thrust    19 obs.   linear      0.65   0.01    0.43     0.01
          19 obs.   loglinear   0.70   0.01    0.01     0.01
          6 obs.    linear      0.15   0.45    0.80     0.45
          6 obs.    loglinear   0.16   0.44    0.85     0.44

weight    19 obs.   linear      0.62   0.01    0.19     0.01
          19 obs.   loglinear   0.64   0.01    0.01     0.01
          6 obs.    linear      0.23   0.34    0.64     0.34
          6 obs.    loglinear   0.16   0.43    0.96     0.43

speed     19 obs.   linear      0.41   0.01    0.29     0.01
          19 obs.   loglinear   0.55   0.01    0.01     0.01
          6 obs.    linear      0.01   0.89    0.48     0.90
          6 obs.    loglinear   0.01   0.84    0.70     0.84

ser       19 obs.   linear      0.01   0.72    0.19     0.72
          19 obs.   loglinear   0.01   0.76    0.01     0.76
          6 obs.    linear      0.36   0.20    0.58     0.20
          6 obs.    loglinear   0.31   0.25    0.01     0.25

year      19 obs.   linear      0.56   0.01    0.01     0.01
          19 obs.   loglinear   0.49   0.01    0.01     0.01
          6 obs.    linear      0.01   0.87    0.88     0.87
          6 obs.    loglinear   0.03   0.76    0.76     0.76

t/w       19 obs.   linear      0.17   0.08    0.80     0.08
          19 obs.   loglinear   0.30   0.01    0.01     0.01
          6 obs.    linear      0.03   0.73    0.21     0.73
          6 obs.    loglinear   0.02   0.78    0.01     0.78


APPENDIX  D
CHOW  TEST

H0 :  The models based on 6 and 13 observations came from the same population.

                               model 1      model 2      model 3

number of observations (n)     6            13           19
number of parameters (k)       3            3            3
residual sum of squares (RSS)  17.7728      78.0055      269.49

$$ F = \frac{(RSS_3 - (RSS_1 + RSS_2))(n_3 - 2k)}{(RSS_1 + RSS_2)(k)} \sim F(k, n_3 - 2k) $$

Thus,

$$ F = \frac{(269.49 - (17.7728 + 78.0055))(19 - 6)}{(17.7728 + 78.0055)(3)} = 7.859 $$

Then, the critical value is

$$ F_{0.95}(3, 13) = 3.415 < 7.859 $$

Therefore, the null hypothesis that the models based on 6 and 13 observations came
from the same population is rejected.
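
The test statistic can be recomputed from the residual sums of squares with a short Python sketch. The function and argument names are illustrative. The critical value 3.415 is taken from the text and is not recomputed here.

```python
def chow_f(rss_pooled, rss_1, rss_2, n_pooled, k):
    """Chow F statistic for two subsamples against the pooled sample.

    k is the number of parameters in each model; the statistic has
    (k, n_pooled - 2k) degrees of freedom.
    """
    rss_split = rss_1 + rss_2
    return ((rss_pooled - rss_split) / k) / (rss_split / (n_pooled - 2 * k))


# RSS values from the table above: 6-observation, 13-observation, and pooled 19-observation models.
f_stat = chow_f(rss_pooled=269.49, rss_1=17.7728, rss_2=78.0055, n_pooled=19, k=3)
print(f"{f_stat:.3f}")  # 7.859

CRITICAL_VALUE = 3.415  # F(3, 13) critical value as quoted in the text
print(f_stat > CRITICAL_VALUE)  # True, so the null hypothesis is rejected
```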